The problem of decentralized sequential change detection is considered,
where an abrupt change occurs in an area that is being monitored
by a number of sensors. The goal is to detect this change as
soon as possible at a central location (fusion center) which receives 
information from all sensors subject to quantization and rate con- 
straints. A novel decentralized sequential detection rule is proposed 
that requires communication from the sensors at random times and 
transmission of only low-bit messages. The second-order asymptotic 
optimality of the proposed scheme is established under different sta- 
tistical models for the sensor observations. Specifically, when each 
sensor process either has continuous paths and is continuously ob- 
served or it is a random walk, it is proved that the inflicted perfor- 
mance loss (with respect to the optimal detection rule that uses the 
complete sensor observations) is bounded asymptotically as the rate 
of false alarms goes to 0. The proposed scheme remains asymptoti- 
cally optimal but of first-order even if it induces an asymptotically 
low communication rate and there is an asymptotically large number 
of sensors. Finally, simulation experiments illustrate its efficiency in 
practice and its superiority over alternative decentralized detection 
rules that rely on communication at deterministic times. 

1. Introduction. Suppose that an area is being monitored by a num- 
ber of sensors which transmit their observations to a central location (fusion 
center). At some unknown time, an abrupt "disorder" occurs in the moni- 
tored area, such as an unexpected intrusion, and changes the dynamics of all 
sensors. Assuming that the sensors acquire their observations sequentially, 
the goal is to raise an alarm at the fusion center, using the transmitted 
messages from all sensors, as soon as possible after the occurrence of the 
change. 

When the sensors transmit their complete observations to the fusion cen- 
ter, this is the classical problem of sequential change detection, which has 
applications in many scientific and industrial fields, such as industrial quality
control [27], computer security [31] and many others. For the methodological
and theoretical developments in sequential change detection we refer
to the books by Basseville and Nikiforov [1]; Poor and Hadjiliadis [24]; as
well as the review articles by Lai [10], [11]; Shiryaev [30]; Polunchenko and 
Tartakovsky [23]. 

In many modern application areas, such as mobile and wireless communi- 
cations and distributed surveillance systems, the sensors are typically low- 
power devices with limited energy, whereas their links with the fusion center 
are characterized by limited communication bandwidth [25], [36]. Therefore, 
in order to preserve the robustness of the network, it is necessary to limit 
the overall communication load and in particular the communication activ- 
ity of each sensor. This primarily implies a quantization constraint, i.e. each 
sensor should transmit a small number of bits each time it communicates 
with the fusion center, but also a rate constraint, i.e. each sensor should 
communicate with the fusion center at a lower rate than its sampling rate. 
In these cases, the problem at hand is first to decide the information that 
should be transmitted from the sensors to the fusion center, respecting the 
above constraints, and then to construct a sequential detection rule at the 
fusion center that relies on this information. We will call such detection rules 
decentralized, in contrast to the centralized ones that rely on the full sensor 
observations. 

A number of articles study decentralized sequential detection rules;
see, for example, Crow and Schwartz [4], Veeravalli [35],
[36], Mei [14], Moustakides [18], Tartakovsky and Veeravalli [8], [32]. In most 
papers, the emphasis is placed on the quantization constraint and, typically, 
one-bit transmission is imposed. With respect to the rate of communication, 
two extreme cases have been considered in the literature. On the one hand, 
it is often assumed that each sensor transmits a quantized version of every 
observation it takes, i.e. the communication rate is equated with the sam- 
pling rate. On the other hand, one-shot schemes have been explored, which 
require each sensor to communicate with the fusion center at most once and 
to transmit a single bit of information. 

However, even if two detection rules require the transmission
of only one bit per communication, they may actually induce very different
transmission activities, if one communicates at a high rate and the
other rarely. Therefore, comparing them may be misleading, especially in
a decentralized setup. In order to highlight this point, we formulate a gen- 
eral framework for the problem of decentralized sequential change detection, 
which encompasses most of the schemes that have been proposed in the
literature.

The main contribution of this work is that we propose a novel decentralized
detection rule and establish its asymptotic optimality under a large class
of sensor dynamics. In particular, we suggest that each sensor communicate 
whenever its local log-likelihood ratio exits an interval and inform the fusion 
center with a low-bit message regarding the evolution of this statistic since 
the previous communication time. The fusion center then uses the transmit- 
ted messages in order to detect the change. Similar communication schemes 
have been used in the context of the decentralized sequential hypothesis 
testing problem by Fellouris and Moustakides [6] and Yilmaz et al. [38]. As 
a result, the communication activity of each sensor is completely controlled 
through the selection of three parameters, the upper and lower bounds of 
the interval and the number of bits the sensor transmits per communication. 

When the local log-likelihood ratios have continuous paths or they are 
random walks, we show that there is only bounded performance loss with 
respect to the optimal centralized detection rule as the period of false alarms 
goes to infinity (second-order asymptotic optimality). Moreover, we show 
that the proposed scheme remains asymptotically optimal (of first-order) 
even when it induces an asymptotically low communication rate and there 
is an asymptotically large number of sensors. Finally, we illustrate with 
simulation experiments its superiority over a decentralized detection rule 
that requires communication at deterministic times. 

The remainder of the paper is organized as follows. In Section 2, we
formulate the problems of centralized and decentralized sequential change 
detection. In Section 3, we describe the main decentralized schemes in the 
literature. In Sections 4 and 5, we define and analyze the proposed scheme in 
continuous and discrete time, respectively. In Section 6, we summarize our 
results and state our conclusions. We prove the main results of the paper, 
as well as some supporting lemmas, in Appendices A, B and C. 

2. Sequential Change Detection. Let $\{\xi_t := (\xi_t^1, \ldots, \xi_t^K)\}$ be a
$K$-dimensional stochastic process, where $\xi_0^k := 0$ for every $1 \le k \le K$ and
time is either discrete ($t \in \mathbb{N}$) or continuous ($t \in [0,\infty)$). The
interpretation is that $\xi^k$ is the process observed by sensor $k$. Thus, if
$\{\mathcal{F}_t^k\}$ is the local filtration at sensor $k$ and $\{\mathcal{F}_t\}$ the global
filtration, then $\mathcal{F}_t^k := \sigma(\xi_s^k,\ 0 \le s \le t)$ and
$\mathcal{F}_t := \bigvee_k \mathcal{F}_t^k$ for every $t$. Moreover, it is understood
that we work with the right-continuous versions of these filtrations
whenever time is continuous.

Let $P_0$ and $P_\infty$ be two probability measures on the canonical space of $\xi$
that are mutually absolutely continuous on any $\sigma$-algebra $\mathcal{F}_t$. We denote by
$u_t$ the corresponding log-likelihood ratio up to time $t$, that is

(2.1)    $u_t := \log \frac{dP_0}{dP_\infty}\Big|_{\mathcal{F}_t}$.

We assume that at some unknown, deterministic time $\tau \ge 0$, the distribution
of $\xi$, which we denote by $P_\tau$, changes from $P_\infty$ to $P_0$. Therefore, $P_\tau$ coincides
with $P_\infty$ on $\mathcal{F}_t$ when $t \in [0, \tau]$ and is absolutely continuous with respect to
$P_\infty$ on $\mathcal{F}_t$ when $t > \tau$, so that

(2.2)    $\log \frac{dP_\tau}{dP_\infty}\Big|_{\mathcal{F}_t} = u_t - u_\tau, \quad t \ge \tau$.



2.1. The centralized setup. The problem of classical (centralized) change
detection is to find an $\{\mathcal{F}_t\}$-stopping time $\mathcal{T}$ that has small detection delay
and rare false alarms, i.e. $\mathcal{T}$ should take large values under $P_\infty$ and $\mathcal{T} - \tau$
should take small values under $P_\tau$. However, the optimal detection
rule depends on how detection delay and false alarms are quantified. There
are different approaches to the sequential change detection problem, such as
the Bayesian formulation due to Shiryaev [28] (see also Peskir and Shiryaev
[21], Gapeev [7], Dayanik et al. [5] and Sezer [26]) and the minimax
formulation due to Pollak [22] (see also Polunchenko and Tartakovsky [23] and
Tartakovsky et al. [33]). In this work, we focus on the formulation suggested
by Lorden [13], where the performance of a detection rule $\mathcal{T}$ is measured by
its worst-case (with respect to $\tau$) conditional expected delay given the worst
possible history of observations up to $\tau$,

(2.3)    $J_L[\mathcal{T}] := \sup_{\tau \ge 0} \operatorname{ess\,sup}\, E_\tau\big[(\mathcal{T} - \tau)^+ \,\big|\, \mathcal{F}_\tau\big]$,

and the optimal detection rule was defined as the solution to the following
constrained optimization problem

(2.4)    $\inf_{\mathcal{T}} J_L[\mathcal{T}]$ when $E_\infty[\mathcal{T}] \ge \gamma$,

where $\gamma$ is a positive constant, fixed in advance by the designer of the scheme.
In other words, the goal in this strongly min-max approach is to minimize
the detection delay under the worst-case scenario with respect to both the
changepoint and the history of observations before the change, while
controlling the period of false alarms above a desired level, $\gamma$.

A related formulation was considered by Moustakides in [17], according
to which the performance measure $J_L$ is replaced by

(2.5)    $J_M[\mathcal{T}] := \sup_{\tau \ge 0} \operatorname{ess\,sup}\, E_\tau\big[(\langle u \rangle_{\mathcal{T}} - \langle u \rangle_\tau)^+ \,\big|\, \mathcal{F}_\tau\big]$

and the optimal detection rule is defined as the solution to the following
optimization problem

(2.6)    $\inf_{\mathcal{T}} J_M[\mathcal{T}]$ when $E_\infty[\langle u \rangle_{\mathcal{T}}] \ge \gamma$,

where $\langle u \rangle_t$ is the quadratic variation of the log-likelihood ratio at time
$t$. Thus, in this formulation, detection delay and false alarm rate are not
measured in terms of the actual time, but in terms of the expected
accumulated quadratic variation until the alarm. The latter criterion also has
an appealing interpretation in terms of Kullback-Leibler divergence, since
$E_0[u_t] = E_0[\langle u \rangle_t]$ and $E_\infty[-u_t] = E_\infty[\langle u \rangle_t]$, whenever these quantities
are finite. However, the main advantage of the latter formulation is that it
admits a solution for a much richer class of dynamics.

In order to be more precise, let us present the Cumulative Sums (CUSUM)
test, which was introduced by Page [20] and can be defined as follows:

(2.7)    $S := \inf\{t \ge 0 : y_t \ge \nu\}$, where $y_t := u_t - \inf_{0 \le s \le t} u_s$,

and $\nu > 0$ is a fixed threshold. When $\{u_t\}_{t \in \mathbb{N}}$ is a random walk, it is
well-known (see Moustakides [15], [16]) that the CUSUM test solves Lorden's [13]
optimization problem (2.4), as long as $\nu$ is chosen so that the false alarm
constraint is satisfied with equality, that is $E_\infty[S] = \gamma$. In continuous time,
when the process $\{u_t\}_{t \ge 0}$ has continuous paths and the following condition
is satisfied

(2.8)    $\lim_{t \to \infty} \langle u \rangle_t = \infty$ $\quad P_0, P_\infty$-a.s.,

the CUSUM test solves (2.6), as long as its threshold $\nu$ is now chosen so that
$E_\infty[\langle u \rangle_S] = \gamma$ (see Moustakides [17] and Chronopoulou and Fellouris [3]). A
direct consequence of the latter optimality result is that the CUSUM test
is also optimal with respect to Lorden's original criterion (2.4) when $u_t$ has
continuous paths and $\langle u \rangle_t$ is proportional to $t$. This is the case for example
when each $\xi^k$ is a fractional Brownian motion (fBm) with Hurst index $H$
before the change and adopts a polynomial drift term with exponent $H + 1/2$
after the change. In the special case $H = 1/2$, this implies the optimality of
the CUSUM test when each $\xi^k$ is a Brownian motion that adopts a linear
drift after the change, which had originally been shown by Shiryaev [29] and
Beibel [2].
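
As a concrete illustration of (2.7) in discrete time, the following Python sketch computes the CUSUM statistic from a sequence of log-likelihood ratio increments and returns the first time it reaches the threshold. It is a minimal sketch and not part of the original analysis: the function names `cusum_alarm_time` and `gaussian_llr_increment`, as well as the Gaussian mean-shift model used in the example, are assumptions introduced here for illustration only.

```python
import random


def cusum_alarm_time(llr_increments, threshold):
    """First time n >= 1 at which y_n = u_n - min_{0<=s<=n} u_s >= threshold.

    `llr_increments` yields u_n - u_{n-1}; u_0 = 0 by convention.
    Returns None if the threshold is never reached.
    """
    u = 0.0            # log-likelihood ratio u_n
    running_min = 0.0  # min of u_s over 0 <= s <= n (includes u_0 = 0)
    for n, increment in enumerate(llr_increments, start=1):
        u += increment
        running_min = min(running_min, u)
        if u - running_min >= threshold:
            return n
    return None


def gaussian_llr_increment(x, mu):
    """LLR of N(mu, 1) against N(0, 1) for one observation x (illustrative model)."""
    return mu * x - 0.5 * mu * mu


if __name__ == "__main__":
    rng = random.Random(0)
    mu, change_point = 1.0, 200
    # Observations are N(0, 1) up to the change point and N(mu, 1) afterwards.
    data = [rng.gauss(0.0 if t <= change_point else mu, 1.0) for t in range(1, 401)]
    alarm = cusum_alarm_time((gaussian_llr_increment(x, mu) for x in data), 6.0)
    print("alarm time:", alarm)
```

The statistic is tracked through the running minimum of $u$, which mirrors the definition in (2.7) directly; an equivalent and commonly used alternative is the recursion $y_n = \max(y_{n-1} + (u_n - u_{n-1}), 0)$.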

In what follows, in order to work with a common criterion, we quantify
the performance of a detection rule $\mathcal{T}$ with

(2.9)    $J[\mathcal{T}] := \sup_{\tau \ge 0} \operatorname{ess\,sup}\, E_\tau\big[(u_{\mathcal{T}} - u_\tau)\, \mathbb{1}_{\{\mathcal{T} > \tau\}} \,\big|\, \mathcal{F}_\tau\big]$

and we define the optimal detection rule as the solution to the following
optimization problem

(2.10)    $\inf_{\mathcal{T}} J[\mathcal{T}]$ when $E_\infty[-u_{\mathcal{T}}] \ge \gamma$.

When $\{u_t\}$ is a random walk, this problem is equivalent to Lorden's
optimization problem, defined by (2.3)-(2.4), as long as we restrict ourselves to
integrable stopping times under $P_0$, $P_\infty$. Similarly, when $\{u_t\}$ has
continuous paths, this problem is equivalent to the modified version of Lorden's
criterion, defined by (2.5)-(2.6), as long as we consider stopping times that
satisfy $E_\infty[\langle u \rangle_{\mathcal{T}}] < \infty$ and $E_0[\langle u \rangle_{\mathcal{T}}] < \infty$. Therefore, under all dynamics for
which it is known to have an exact optimality property, the CUSUM test
solves the problem defined by (2.9)-(2.10), given that its threshold is chosen
so that $E_\infty[-u_S] = \gamma$, which will be our standing assumption from now on.

2.2. The decentralized setup. In a decentralized setup, the goal is first to 
select a communication scheme subject to quantization and rate constraints 
and then to find a detection rule that is adapted to the filtration that is 
induced by the chosen communication scheme. 

More specifically, we define a decentralized sequential detection rule as a
pair $(\{\tilde{\mathcal{F}}_t\}, \mathcal{T})$, where $\mathcal{T}$ is an $\{\tilde{\mathcal{F}}_t\}$-stopping time and $\{\tilde{\mathcal{F}}_t\}$ is a filtration
of the form

(2.11)    $\tilde{\mathcal{F}}_t := \sigma\big((\tau_n^k, z_n^k) : \tau_n^k \le t,\ k = 1, \ldots, K\big)$,

where each $\{\tau_n^k\}_{n \in \mathbb{N}}$ is the sequence of communication times for sensor $k$ and
$z_n^k$ is the message transmitted to the fusion center at time $\tau_n^k$. Each $\tau_n^k$ must
be an $\{\mathcal{F}_t^k\}$-stopping time and each $z_n^k$ an $\mathcal{F}^k_{\tau_n^k}$-measurable random variable
that takes values in a finite set, so that a small number of bits is required for
its transmission to the fusion center. Moreover, since many applications are
characterized by limited storage capacity, we require in particular that each
$z_n^k$ is measurable with respect to the $\sigma$-algebra generated by the observations
at sensor $k$ between its $(n-1)$th and $n$th transmission, that is
$\sigma(\xi_s^k,\ \tau_{n-1}^k < s \le \tau_n^k)$.

Note that this framework allows only one-way communication from the 
sensors to the fusion center, thus it forbids any communication between 
sensors or feedback from the fusion center to the sensors. Indeed, such pos- 
sibilities impose a much heavier communication load in the network and 
raise questions regarding the design of the network architecture, which we 
do not consider here. Furthermore, under the assumption of independence 
across sensors, it is intuitive that such possibilities should be redundant and
one of the goals of this work is to show that this is indeed the case. For
decentralized detection rules that require feedback we refer to Veeravalli
[35].

2.3. Asymptotic optimality. Ideally, we would like to find the best
decentralized detection rule, optimizing with respect to both the fusion center
filtration and stopping time. Such an optimization problem is highly
intractable, even if one makes a number of simplifying assumptions [35]. For
this reason, we will use the centralized CUSUM as the ultimate benchmark
and compare any decentralized detection rule against it. We can only
hope that such a detection rule attains the optimal centralized performance
asymptotically, that is for large periods of false alarms.

Thus, in what follows, if $(\{\tilde{\mathcal{F}}_t\}, \mathcal{T})$ is an arbitrary decentralized detection
rule satisfying the false alarm constraint $E_\infty[-u_{\mathcal{T}}] \ge \gamma$ and $S$ is the
centralized CUSUM rule satisfying the false alarm constraint with equality,
i.e. $E_\infty[-u_S] = \gamma$, we will say that $\mathcal{T}$ is asymptotically optimal

• of order-1, if $J[\mathcal{T}]/J[S] \to 1$,

• of order-2, if $J[\mathcal{T}] - J[S] = O(1)$,

• of order-3, if $J[\mathcal{T}] - J[S] = o(1)$,

as $\gamma \to \infty$. Note that, contrary to order-1, order-2 asymptotic optimality
guarantees that the performance loss of $\mathcal{T}$ remains bounded as $\gamma \to \infty$. Of
course, it is even better if the performance loss vanishes as $\gamma \to \infty$, which
is the case of order-3 asymptotic optimality. Clearly, order-3 implies
order-2 which implies order-1, since $J[\mathcal{T}], J[S] \to \infty$ as $\gamma \to \infty$.

3. Existing decentralized schemes. The main decentralized detection
rules encountered in the literature can be classified into two main
categories. In the first, the sensors transmit systematically compressed versions
of their data to the fusion center and the latter combines these quantized
messages in order to detect the change. In the second, each sensor detects
individually the change and the fusion center combines the local sensor
decisions.

In order to describe them in more detail, we need to introduce some
additional notation and assumptions. Thus, we denote by $P_0^k$ and $P_\infty^k$ the
post and pre-change measure of $\xi^k$ and by $u_t^k$ their log-likelihood ratio up
to time $t$, that is

(3.1)    $u_t^k := \log \frac{dP_0^k}{dP_\infty^k}\Big|_{\mathcal{F}_t^k}$.

Moreover, we assume that the following local and average Kullback-Leibler
information numbers are positive and finite

(3.2)    $I_0^k := E_0[u_1^k]$, $\quad I_\infty^k := -E_\infty[u_1^k]$, $\quad I_0 := \frac{1}{K} E_0[u_1]$, $\quad I_\infty := \frac{1}{K} E_\infty[-u_1]$.

Furthermore, we assume that the sensor processes are independent, that is
$P_\infty := P_\infty^1 \times \cdots \times P_\infty^K$ and $P_\tau := P_\tau^1 \times \cdots \times P_\tau^K$ for any $\tau$, which implies

(3.3)    $u_t = u_t^1 + \ldots + u_t^K, \quad t \ge 0$,

and consequently $I_0 = \frac{1}{K} \sum_{k=1}^K I_0^k$ and $I_\infty = \frac{1}{K} \sum_{k=1}^K I_\infty^k$.

3.1. Q-CUSUM. Suppose that each sensor transmits to the fusion center
quantized versions of its local log-likelihood ratio process at deterministic
equidistant times. Thus, if $r$ is the communication period and for each sensor
$k$ we consider the alphabet $\{1, \ldots, b^k\}$, where $b^k \ge 2$ is an integer, then

(3.4)    $\tau_n^k = rn$ and $z_n^k = j$ when $\Gamma_{j-1}^k \le u^k_{\tau_n^k} - u^k_{\tau_{n-1}^k} < \Gamma_j^k$,

with $j = 1, \ldots, b^k$, where $-\infty = \Gamma_0^k < \Gamma_1^k < \ldots < \Gamma_{b^k}^k = \infty$ are fixed
thresholds, chosen by the designer. When the sensors take continuous-time
observations, the communication rate $1/r$ is clearly smaller than the
(infinite) sampling rate (for any positive number $r$). On the other hand, when
the sensors take discrete-time observations, the communication period is $r$
times larger than the sampling period, where now $r \ge 1$ is an integer.

The communication scheme (3.4) induces synchronous communication to
the fusion center, which receives at each time $\tau_n^k = rn$ the $K$-dimensional
vector $(z_n^1, \ldots, z_n^K)$. If we additionally assume that each $\{u_t^k\}$ has stationary
and independent increments, then a natural detection rule at the fusion
center is the corresponding CUSUM stopping time

(3.5)    $\hat{S} := r \cdot \inf\{n \in \mathbb{N} : \hat{y}_n \ge \hat{\nu}\}$,

where the threshold $\hat{\nu}$ is chosen so that the false alarm constraint is satisfied
with equality and the CUSUM statistic $\{\hat{y}_n\}$ admits the following recursion:

(3.6)    $\hat{y}_n := (\hat{y}_{n-1})^+ + \sum_{k=1}^{K} \sum_{j=1}^{b^k} \mathbb{1}_{\{z_n^k = j\}} \log \frac{P_0(z_n^k = j)}{P_\infty(z_n^k = j)}, \quad \hat{y}_0 := 0$.



One might claim that, given the quantization rule (3.4), the CUSUM rule
defined in (3.5)-(3.6) is the best we can do at the fusion center. We should
note, however, that this claim is known to be true only when $r = 1$. When
$r > 1$, the messages sent to the fusion center are not i.i.d. before and after the
change, which is the crucial property for the optimality of CUSUM. Indeed,
the change can take place between two consecutive communication instances,
thus generating non-i.i.d. data. Of course, if one could demonstrate that the
worst-case scenario is for the change to occur at a communication instant, then
this would clearly establish optimality for the CUSUM test at the fusion
center for any $r > 1$.
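
The fusion-center recursion (3.6) can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that each received symbol contributes the logarithm of the ratio of its post-change to its pre-change probability, and that these message distributions (`p0` and `pinf`, hypothetical names) are known at the fusion center. The function name `q_cusum_alarm` is likewise introduced here and does not come from the paper.

```python
import math


def q_cusum_alarm(messages, p0, pinf, threshold, period):
    """Alarm time, in physical time units, of a quantized CUSUM rule.

    messages : iterable of tuples (z^1, ..., z^K), one tuple per communication
               time, where z^k is a symbol in {1, ..., b_k}.
    p0, pinf : p0[k][j-1] and pinf[k][j-1] are the post- and pre-change
               probabilities that sensor k sends symbol j (assumed known).
    threshold: CUSUM threshold at the fusion center.
    period   : communication period r; the alarm time is r * n.
    Returns None if the threshold is never reached.
    """
    y = 0.0  # fusion statistic, y_0 = 0
    for n, symbols in enumerate(messages, start=1):
        llr = sum(
            math.log(p0[k][z - 1] / pinf[k][z - 1])
            for k, z in enumerate(symbols)
        )
        # Positive part of the previous value plus the new message LLR.
        y = max(y, 0.0) + llr
        if y >= threshold:
            return period * n
    return None


if __name__ == "__main__":
    # One sensor with a binary alphabet; symbol 2 is more likely after the change.
    p0, pinf = [[0.25, 0.75]], [[0.75, 0.25]]
    print(q_cusum_alarm([(1,), (2,), (2,), (2,)], p0, pinf, 3.0, period=5))
```

Multiplying the stopping index by `period` reflects the conversion back to physical time units that the text describes for (3.5).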

We call this detection scheme Q-CUSUM, where Q stands for the
"quantization" employed by this method. Note that we have to multiply by $r$ in
(3.5) in order to return to physical time units, since the samples are at a
rate $1/r$. It is straightforward to see that as $\gamma \to \infty$

(3.7)    $J[\hat{S}] \sim \frac{r I_0}{\hat{I}_0} \log \gamma, \quad \hat{I}_0 := \frac{1}{K} \sum_{k=1}^{K} \sum_{j=1}^{b^k} P_0(z_1^k = j) \log \frac{P_0(z_1^k = j)}{P_\infty(z_1^k = j)}$,

where $I_0$ is the average Kullback-Leibler information number defined in
(3.2). Therefore, the asymptotic performance of $\hat{S}$ can be optimized by
choosing the thresholds $\{\Gamma_j^k\}$ in order to maximize $\hat{I}_0$. However, it is
well-known (see for example [34]) that for any choice of thresholds,
communication period and alphabet size, $r I_0 > \hat{I}_0$, therefore $\hat{S}$ is not even
asymptotically optimal of order-1.

In the case that the sensors take discrete-time observations, this detection 
rule has been studied in [4], [14], [18], [32] under the assumption that r = 1, 
i.e. when each sensor communicates with the fusion center at every obser- 
vation time. In the case that the sensors observe Brownian motion paths in 
continuous-time, the performance of this rule was explored in [18]. 

3.2. Fusion of local CUSUM rules. We now consider the class of fusion 
center detection rules that rely on the local decisions of the sensors. More 
specifically, we assume that sensor k communicates at the following times 

(3.8)    $\tau_n^k := \inf\{t > \tau_{n-1}^k : y_t^k \ge c^k\}$,

where $y_t^k := u_t^k - \min_{0 \le s \le t} u_s^k$ is the local CUSUM statistic at sensor $k$ and $c^k$
is a positive threshold. With this communication scheme, a sensor transmits
a message to the fusion center only to announce that it has detected a change 
and this requires only one bit per transmission. Thus, even if the network 
can support the transmission of multi-bit messages, this additional flexibility 
is not going to be useful for this communication scheme.

There are many reasonable fusion center policies that can be based on
(3.8). For example, the fusion center may raise an alarm at the first time
some sensor communicates, i.e. at $\min_k \tau_1^k$ (min-CUSUM). Alternatively, the
alarm can be raised when every sensor has communicated at least once, i.e. at
$\max_k \tau_1^k$ (max-CUSUM). It is clear that both schemes require minimal
communication activity in the network, as they require the transmission of at
most one bit from each sensor. The exact performance of these two schemes
was computed in [18] in the case that each sensor observes a Brownian path.
In the case of discrete-time i.i.d. observations, it was shown in [32] that they
both have the same first-order asymptotic performance. However, numeri- 
cal experiments in both papers suggest that the min-CUSUM performs in 
practice much better than the max-CUSUM. Of course, as one would ex- 
pect, both schemes are asymptotically suboptimal in all the above cases. 
(We should note however that the min-CUSUM is an appealing detection 
rule in the case that the change may occur in at most one sensor, see for 
example [8]). 

An alternative detection rule that is based on the communication scheme
(3.8) is to raise an alarm the first time that all sensors communicate at the
same time, that is, at

(3.9)    $\mathcal{M} := \inf\{t : y_t^k \ge c^k,\ \forall\, k = 1, \ldots, K\}$.

Thus, contrary to the previous one-shot schemes, each sensor now keeps 
transmitting messages to the fusion center, even after it has detected a 
change, until they all agree simultaneously that the change has indeed oc- 
curred. In this way, the induced communication activity is intense only after 
the change has occurred. Before the change, a sensor communicates only to 
report a local false alarm, which is a rare event. 
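
To make the differences between these fusion policies concrete, the following Python sketch computes, for the same discrete-time data, the alarm times of the min-CUSUM, the max-CUSUM and the rule in (3.9). It is a sketch under stated assumptions: time is discrete, the local log-likelihood ratio increments are given, and the function names `local_cusum_paths` and `fusion_alarm_times` are hypothetical helpers introduced for this example.

```python
def local_cusum_paths(increments):
    """Local CUSUM statistics y_t^k for each sensor k and time t = 1, 2, ...

    increments[k][t-1] is the LLR increment of sensor k at time t. The
    recursion y_t = max(y_{t-1} + increment, 0) is equivalent to
    y_t = u_t - min_{0<=s<=t} u_s with u_0 = 0.
    """
    paths = []
    for sensor_increments in increments:
        y, path = 0.0, []
        for inc in sensor_increments:
            y = max(y + inc, 0.0)
            path.append(y)
        paths.append(path)
    return paths


def fusion_alarm_times(increments, thresholds):
    """Alarm times of three fusion policies based on local CUSUM rules.

    Returns a dict with keys "min", "max" and "all"; a value is None when
    the corresponding rule does not stop within the available data.
    """
    paths = local_cusum_paths(increments)
    horizon = min(len(path) for path in paths)
    # First local alarm time of each sensor (None if it never alarms).
    first = [
        next((t for t, y in enumerate(path, start=1) if y >= c), None)
        for path, c in zip(paths, thresholds)
    ]
    reached = [t for t in first if t is not None]
    min_time = min(reached) if reached else None
    max_time = max(reached) if len(reached) == len(first) else None
    # Rule (3.9): first time all local statistics exceed their thresholds.
    all_time = next(
        (
            t
            for t in range(1, horizon + 1)
            if all(path[t - 1] >= c for path, c in zip(paths, thresholds))
        ),
        None,
    )
    return {"min": min_time, "max": max_time, "all": all_time}


if __name__ == "__main__":
    incs = [[1, 1, -3, 1, 1], [0, 0, 1, 1, 0.5]]
    print(fusion_alarm_times(incs, thresholds=[2, 2]))
```

In the small example the first sensor alarms early and then falls back below its threshold, so the three rules stop at three different times, which illustrates why the simultaneous-agreement rule can keep sensors communicating after their first local alarm.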

This rule was suggested and analyzed by Mei$^1$ in [14], where it was shown
that when each $\{u_t^k\}$ is a random walk whose increments have a finite second
moment, $\mathcal{M}$ is asymptotically optimal of first order, as long as each
threshold $c^k$ is chosen to be proportional to $I_0^k$ (the constant of proportionality
is determined by the false alarm constraint). This is the only asymptotically
optimal decentralized detection rule that is known in the literature
so far. However, $\mathcal{M}$ is not asymptotically optimal of second order, since
$J[\mathcal{M}] - J[S] = \Theta(\sqrt{\log \gamma})$ as $\gamma \to \infty$ [14]. Moreover, despite its asymptotic
optimality, $\mathcal{M}$ can be inefficient in practice, especially in the case that $K$ is
large. These points were illustrated numerically in [14] and in [32]. Finally,
we should note that, contrary to Q-CUSUM, $\mathcal{M}$ does not have any degrees
of freedom that allow the designer to control the induced communication
activity, as the design parameters $\{c^k\}$ in (3.9) are completely specified by
the desired period of false alarms.

$^1$ The presentation of $\mathcal{M}$ is different in [14], where it is assumed that each sensor must
communicate at every observation time $t$ the outcome of the event $\{y_t^k \ge c^k\}$. However, it
is easy to see that communication is needed only when the local statistic exceeds the
local threshold.

3.3. Comparisons. The min-CUSUM and Mei's scheme have been compared
numerically with Q-CUSUM in a discrete-time setup, when in the
latter scheme each sensor transmits one-bit messages ($b^k = 2$) at every
observation time ($r = 1$). Under these assumptions, it has been reported (see
[14], [32]) that Q-CUSUM typically performs better than the min-CUSUM
(see [32]) and that $\mathcal{M}$ performs worse than Q-CUSUM in the case of large
sensor networks.

In the case that each sensor observes a Brownian path, it was shown 
numerically in [18] that the performance of Q-CUSUM does not improve 
with a very low communication period r. Moreover, it was shown that min- 
CUSUM and Q-CUSUM have essentially the same performance when in the 
latter each sensor transmits one-bit messages ($b^k = 2$).

The above comparisons offer some important insights; however, they do
not take into account the fact that the compared schemes have very differ- 
ent communication activities. Therefore, these comparisons may not be very 
informative, since the goal in the decentralized setup is to optimize the per- 
formance of the detection rule while controlling the overall communication 
load in the network. 

3.4. D-CUSUM; A novel scheme. From the previous discussion it is clear 
that it remains an open problem to find an asymptotically optimal and ef- 
ficient decentralized detection rule, whose communication rate can be con- 
trolled and whose efficiency can be preserved even with a low communication 
rate or a large number of sensors. Our contribution in this work is that we 
propose and analyze a novel decentralized detection rule with these charac- 
teristics. 

The proposed scheme is based on threshold quantization of the local log- 
likelihood ratios and a CUSUM-like rule at the fusion center, just like Q- 
CUSUM. Its difference is that the sensors communicate asynchronously with 
the fusion center, at random instead of deterministic times, in particular at 
two-sided exit times of the local log-likelihood ratios. Due to this characteris- 
tic, the proposed detection rule turns out to be asymptotically optimal, con- 
trary to Q-CUSUM. Actually, it can attain a much stronger (second-order) 
asymptotic optimality than Mei's scheme, whereas its asymptotic optimality 
is preserved even with an asymptotically low communication rate and a large
sensor network. Finally, these theoretical properties are accompanied by a
very good performance in practice, as our simulation experiments suggest. 

The design and analysis of the proposed scheme differ significantly de- 
pending on whether the sensors take discrete or continuous time observa- 
tions. Therefore, we will consider these two cases separately in the next 
two sections. However, we will see that the properties of the suggested rule, 
which we will call D-CUSUM, turn out to be very similar under both setups. 

4. D-CUSUM in Continuous-Time. In this section, $t \in [0, \infty)$ and
each log-likelihood ratio process $\{u_t^k\}_{t \ge 0}$ is assumed to have continuous paths
for every $k = 1, \ldots, K$. Moreover, we assume that condition (2.8) is satisfied,
which guarantees the optimality of the centralized CUSUM test, $S$. We also
have the following closed-form expressions (see [17], [3])

(4.1)    $\gamma = E_\infty[-u_S] = E_\infty[\langle u \rangle_S] = e^{\nu} - \nu - 1$,
         $J[S] = E_0[u_S] = E_0[\langle u \rangle_S] = e^{-\nu} + \nu - 1$.

Moreover, we assume that the sensor processes are independent, thus (3.3) 
holds. However, we will see that this assumption can be removed in the case 
that the sensors observe correlated Brownian motions. 

Our goal in this section is to describe the continuous-time version of the 
proposed scheme and compare its performance with that of the optimal 
centralized CUSUM test. 

We suggest that each sensor $k$ communicates with the fusion center at the
sequence of $\{\mathcal{F}_t^k\}$-stopping times that is defined by the following recursion:

(4.2)    $\tau_n^k := \inf\{t \ge \tau_{n-1}^k : u_t^k - u^k_{\tau_{n-1}^k} \notin (-\underline{\Delta}_k, \overline{\Delta}_k)\}, \quad n \in \mathbb{N}; \quad \tau_0^k := 0$,

where the thresholds $\overline{\Delta}_k, \underline{\Delta}_k$ are fixed, positive constants, known to sensor
$k$ and the fusion center. We denote by $\ell_n^k := u^k_{\tau_n^k} - u^k_{\tau_{n-1}^k}$ the accumulated
log-likelihood ratio at sensor $k$ in the time-interval $[\tau_{n-1}^k, \tau_n^k]$. Then, due to
the path-continuity of $\{u_t^k\}$, it is clear that each $\ell_n^k$ will be exactly equal to
either $\overline{\Delta}_k$ or $-\underline{\Delta}_k$. Therefore, if the fusion center receives at $\tau_n^k$ the following
one-bit message by sensor $k$

(4.3)    $z_n^k := \begin{cases} 1, & \text{if } \ell_n^k = \overline{\Delta}_k, \\ -1, & \text{if } \ell_n^k = -\underline{\Delta}_k, \end{cases}$

it learns the exact value of $\ell_n^k$, since
$\ell_n^k = \overline{\Delta}_k \mathbb{1}_{\{z_n^k = 1\}} - \underline{\Delta}_k \mathbb{1}_{\{z_n^k = -1\}}$. As a
result, the fusion center is able to recover $u_t^k$ at every communication time
$t = \tau_n^k$, since $u^k_{\tau_n^k} = \ell_1^k + \ldots + \ell_n^k$. Then, a natural approximation for $u_t^k$ at
some arbitrary time $t$ is the corresponding most recently reproduced value,
i.e.

(4.4)    $\tilde{u}_t^k := \sum_{n=1}^{m_t^k} \ell_n^k, \quad m_t^k := \max\{n : \tau_n^k \le t\}$.

Finally, mimicking the centralized CUSUM rule, we propose the following
detection rule at the fusion center

(4.5)    $\tilde{S} := \inf\{t \ge 0 : \tilde{y}_t \ge \tilde{\nu}\}$, where $\tilde{y}_t := \tilde{u}_t - \inf_{0 \le s \le t} \tilde{u}_s, \quad \tilde{u}_t := \sum_{k=1}^{K} \tilde{u}_t^k$,

and the threshold $\tilde{\nu}$ is chosen so that $E_\infty[-u_{\tilde{S}}] = \gamma$.



4.1. Design and implementation. The proposed scheme has a number
of practical advantages. First of all, the fusion statistic $\{\tilde{y}_t\}$ is
piecewise-constant and needs to be updated only at the communication times from
the sensors, according to the following convenient recursion formula:

(4.6)    $\tilde{y}_{\tau_n^k} = (\tilde{y}_{\tau_n^k -})^+ + \overline{\Delta}_k \mathbb{1}_{\{z_n^k = 1\}} - \underline{\Delta}_k \mathbb{1}_{\{z_n^k = -1\}}$.

In other words, whenever it receives a message from sensor $k$, the fusion
center simply needs to add $\overline{\Delta}_k$ or $-\underline{\Delta}_k$ to the positive part of the current
value of its statistic, $\{\tilde{y}_t\}$. Compare this with the centralized,
continuous-time CUSUM statistic $\{y_t\}$, which does not have this nice property (unless
the sensors observe Brownian motions) and whose calculation at the fusion
center requires high-frequency transmission of "infinite-bit" messages from
the sensors.
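
The update (4.6) is simple enough to sketch directly. The Python fragment below processes a time-ordered stream of one-bit messages and returns the alarm time of the fusion rule. It is a minimal sketch under stated assumptions: the message format `(time, sensor, bit)` and the names `d_cusum_alarm`, `delta_up` and `delta_low` (standing for the upper and lower thresholds of each sensor) are hypothetical and chosen here for illustration.

```python
def d_cusum_alarm(messages, delta_up, delta_low, threshold):
    """Alarm time of the fusion rule driven by one-bit sensor messages.

    messages  : iterable of (time, sensor, bit), sorted by time, where bit
                is +1 (upper exit) or -1 (lower exit) of the local LLR.
    delta_up  : delta_up[k] is the upper threshold of sensor k.
    delta_low : delta_low[k] is the lower threshold of sensor k.
    threshold : fusion-center threshold.
    Returns the time of the message that triggers the alarm, or None.
    """
    y = 0.0  # fusion statistic; it changes only when a message arrives
    for time, sensor, bit in messages:
        if bit not in (1, -1):
            raise ValueError("bit must be +1 or -1")
        step = delta_up[sensor] if bit == 1 else -delta_low[sensor]
        # Add the signed threshold to the positive part of the current value.
        y = max(y, 0.0) + step
        if y >= threshold:
            return time
    return None


if __name__ == "__main__":
    stream = [(0.7, 0, -1), (1.2, 1, 1), (1.9, 0, 1), (2.4, 0, 1), (3.0, 1, 1)]
    print(d_cusum_alarm(stream, delta_up=[1.0, 0.5], delta_low=[1.0, 0.5], threshold=2.5))
```

Because the statistic is constant between messages, the fusion center does no work between communication times, which is the practical advantage described above.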

The thresholds $\overline{\Delta}_k, \underline{\Delta}_k$ control the communication rate of sensor $k$, thus
they should ideally be chosen in order to attain target values for the expected
inter-communication times, $E_0[\tau_n^k - \tau_{n-1}^k]$ and $E_\infty[\tau_n^k - \tau_{n-1}^k]$. However, since
these expectations in general depend on $n$, we propose instead to select
$\overline{\Delta}_k, \underline{\Delta}_k$ to attain target values for $E_0[\ell_n^k]$ and $E_\infty[-\ell_n^k]$.

These quantities represent the expected accumulated Kullback-Leibler
divergences between the post and pre-change measure in the path of $\xi^k$
during the time-interval $[\tau_{n-1}^k, \tau_n^k]$ and they do not depend on $n$, since
$E_0[\ell_n^k] = s(\overline{\Delta}_k, \underline{\Delta}_k)$ and $-E_\infty[\ell_n^k] = s(\underline{\Delta}_k, \overline{\Delta}_k)$, where

In this way, the specification of $\overline{\Delta}_k, \underline{\Delta}_k$ requires only the solution of a system
of two non-linear equations.
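
For orientation, the two expectations can be evaluated with a standard optional-stopping argument, sketched below in Python. This is an assumption-laden illustration and may not coincide in form with the function $s$ used in the text: it assumes that the local log-likelihood ratio has continuous paths and exits the interval almost surely, so that the increment takes only the two values `delta_up` and `-delta_low`, and it uses the martingale identities $E_\infty[e^{\ell}] = 1$ and $E_0[e^{-\ell}] = 1$ to obtain the exit probabilities. The function name `exit_expectations` is hypothetical.

```python
import math


def exit_expectations(delta_up, delta_low):
    """Return (E_0[l], E_inf[-l]) for a two-sided exit of a continuous LLR.

    l equals delta_up on an upper exit and -delta_low on a lower exit.
    The exit probabilities follow from optional stopping applied to the
    likelihood-ratio martingales (an assumption of this sketch).
    """
    # Post-change probability of an upper exit, from E_0[exp(-l)] = 1.
    p0_up = (math.exp(delta_low) - 1.0) / (math.exp(delta_low) - math.exp(-delta_up))
    # Pre-change probability of an upper exit, from E_inf[exp(l)] = 1.
    pinf_up = (1.0 - math.exp(-delta_low)) / (math.exp(delta_up) - math.exp(-delta_low))
    e0 = p0_up * delta_up - (1.0 - p0_up) * delta_low
    einf = (1.0 - pinf_up) * delta_low - pinf_up * delta_up
    return e0, einf


if __name__ == "__main__":
    for up, low in [(1.0, 1.0), (2.0, 0.5), (0.5, 2.0)]:
        e0, einf = exit_expectations(up, low)
        print(f"delta_up={up}, delta_low={low}: E0[l]={e0:.4f}, Einf[-l]={einf:.4f}")
```

Given target values for the two expectations, the thresholds could then be obtained by any two-dimensional root finder applied to this map; with symmetric thresholds both expectations reduce to $\Delta \tanh(\Delta/2)$ under the assumptions above, so a one-dimensional search suffices in that case.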
4.2. Asymptotic optimality. From the previous discussion it should be clear that, from a practical point of view, D-CUSUM is preferable to the corresponding centralized CUSUM. Our goal in this section is to show that it also has excellent performance characteristics, making any additional benefit of the optimal centralized CUSUM test negligible relative to its implementation cost. The following theorem is crucial in this direction, as it provides a very useful, non-asymptotic upper bound on the performance loss of the proposed detection structure.

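Theorem 1. For any $\gamma$ and $\{\overline{\Delta}^k, \underline{\Delta}^k\}_{1 \le k \le K}$ we have

(4.8)    $J[\tilde{S}] - J[S] \le 4 K \Delta_{\max},$

where $\Delta_{\max} := \max_{1 \le k \le K} \max\{\overline{\Delta}^k, \underline{\Delta}^k\}$.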

Proof. The proof is presented in Appendix A. □ 



Let us discuss the implications of Theorem 1. The bound provided in (4.8) implies that for any fixed thresholds $\{\overline{\Delta}^k, \underline{\Delta}^k\}$ and any number of sensors K, the performance loss of $\tilde{S}$ is bounded as $\gamma \to \infty$; in other words, $\tilde{S}$ is asymptotically optimal of order-2. There are also interesting conclusions that we can draw for the case of a large sensor network ($K \to \infty$). In particular, if we let $K \to \infty$ and $\Delta_{\max} \to 0$ so that $K \Delta_{\max} = O(1)$, then $\tilde{S}$ remains asymptotically optimal of order-2. In other words, with a very large sensor network, second-order optimality is preserved if there is a sufficiently high rate of communication from the sensors to the fusion center. Of course, we can do even better by letting $K \to \infty$ and $\Delta_{\max} \to 0$ so that $K \Delta_{\max} = o(1)$, in which case D-CUSUM becomes asymptotically optimal of order-3 and its distance from the optimal centralized CUSUM vanishes asymptotically. However, since we want to avoid the frequent communication activity that is induced by letting $\Delta_{\max} \to 0$, it is more interesting to see that $\tilde{S}$ remains asymptotically optimal (of order-1) in an asymptotically large sensor network ($K \to \infty$) and/or under an asymptotically low communication rate ($\Delta_{\max} \to \infty$), as long as $K \Delta_{\max} = o(\log \gamma)$. Indeed, from (4.1) and (4.8) we have

$\frac{J[\tilde{S}] - J[S]}{J[S]} \le \frac{4 K \Delta_{\max}}{e^{-\nu} + \nu - 1},$

and our claim now follows from (4.1), which implies that $\nu = \log \gamma + o(1)$.
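To get a feel for how quickly the relative loss vanishes, the following Python sketch evaluates the bound numerically. It assumes the continuous-time relations used in this paper, namely that the centralized threshold solves exp(v) - v - 1 = gamma and that the optimal delay is exp(-v) + v - 1. The function names are hypothetical.

```python
import math

def cusum_threshold(gamma, tol=1e-12):
    """Solve exp(v) - v - 1 = gamma for v > 0 by bisection."""
    lo, hi = 0.0, 1.0
    while math.exp(hi) - hi - 1.0 < gamma:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.exp(mid) - mid - 1.0 < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_loss_bound(gamma, num_sensors, delta_max):
    """Upper bound on the relative performance loss implied by Theorem 1."""
    v = cusum_threshold(gamma)
    return 4.0 * num_sensors * delta_max / (math.exp(-v) + v - 1.0)
```

Because the denominator grows like log(gamma), the bound decays only logarithmically in the false-alarm period, which is why the product of the number of sensors and the largest threshold has to be of smaller order than log(gamma).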
4.3. The case of correlated sensors. Going over the proof of Theorem 1 in Appendix A, we realize that the assumption of independence across sensors is needed only to the extent that it guarantees a decomposition of the form $u_t = \sum_{k=1}^{K} u^k_t$, where $\{u^k_t\}$ is an $\{\mathcal{F}^k_t\}$-adapted process with continuous paths. Indeed, we did not use at all the fact that $\{u^k_t\}$ is the local log-likelihood ratio at sensor k. This implies that the previous results (and the corresponding analysis) are valid even for sensors with correlated dynamics, as long as such a decomposition is possible. This is for example the case when the sensors observe correlated Brownian motions before and after the change, i.e.

(4.10)    $\xi^k_t = \sum_{j=1}^{K} \sigma_{kj} W^j_t + 1_{\{t > \tau\}} \, \mu^k t, \quad t \ge 0, \quad k = 1, \ldots, K,$

where $(W^1, \ldots, W^K)$ is a standard K-dimensional Wiener process, $\mu = [\mu^1, \ldots, \mu^K]'$ a K-dimensional real vector and $\sigma := [\sigma_{ij}]$ a square matrix of dimension K so that the diffusion coefficient matrix $\Sigma = \sigma \sigma'$ is invertible. Then, we can write $u_t = \sum_{k=1}^{K} [b^k \xi^k_t - 0.5 \mu^k b^k t]$, where $b = [b^1, \ldots, b^K]' = \Sigma^{-1} \mu$, and Theorem 1 remains valid as long as we replace in the definition (4.2) of the stopping times $(\tau^k_n)$ the local log-likelihood ratio, $u^k_t = \mu^k \xi^k_t - 0.5 (\mu^k)^2 t$, by $b^k \xi^k_t - 0.5 \mu^k b^k t$.
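The weights of this decomposition can be computed with a single linear solve. The sketch below is a dependency-free Python illustration; the helper names (`solve_linear`, `fusion_weights`, `local_statistic`) are hypothetical, and the Gaussian elimination is only meant for the small, well-conditioned systems of this example.

```python
def solve_linear(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fusion_weights(sigma, mu):
    """Return b, the solution of (sigma sigma') b = mu."""
    k = len(mu)
    cov = [[sum(sigma[i][l] * sigma[j][l] for l in range(k))
            for j in range(k)] for i in range(k)]
    return solve_linear(cov, mu)

def local_statistic(b_k, mu_k, xi_k_t, t):
    """Process that sensor k monitors in place of its local
    log-likelihood ratio when the sensors are correlated."""
    return b_k * xi_k_t - 0.5 * mu_k * b_k * t
```

For example, with sigma = [[1, 0], [0.5, 1]] and mu = [1, 2] the weights are b = [0.25, 1.5]. Each sensor then only needs its own weight and drift to run its two-sided exit rule.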

5. D-CUSUM in Discrete-Time. In this section, $t \in \mathbb{N}$ and the increments $\{u^k_t - u^k_{t-1}\}_{t \in \mathbb{N}}$ are independent and identically distributed random variables with a finite second moment under both $P_0$ and $P_\infty$ for every $1 \le k \le K$. Moreover, we assume that the sensor processes are independent, thus (3.3) is valid.

Our goal in this section is to describe the design and optimality properties of the proposed scheme in this discrete-time setup, emphasizing the differences and similarities with the continuous-time setup.

5.1. Proposed decentralized detection rule. As in the previous section, we assume that sensor k communicates with the fusion center at the following stopping times

(5.1)    $\tau^k_n := \inf\{t \ge \tau^k_{n-1} : u^k_t - u^k_{\tau^k_{n-1}} \notin (-\underline{\Delta}^k, \overline{\Delta}^k)\}, \quad n \in \mathbb{N}; \qquad \tau^k_0 := 0,$

where $\overline{\Delta}^k, \underline{\Delta}^k$ are fixed, positive thresholds, and we set

$\ell^k_n := u^k_{\tau^k_n} - u^k_{\tau^k_{n-1}}, \quad n \in \mathbb{N}.$

However, unlike the previous section, $\ell^k_n$ is not restricted to the binary set $\{\overline{\Delta}^k, -\underline{\Delta}^k\}$, but takes values in $(-\infty, -\underline{\Delta}^k] \cup [\overline{\Delta}^k, \infty)$. As a result, the fusion center cannot recover $\ell^k_n$ exactly when sensor k can transmit only a finite number of bits. This additional source of performance degradation obliges us to modify the scheme we proposed in continuous time, both in terms of the messages that are transmitted to the fusion center and in how the fusion center utilizes these messages.

First, we need to generalize the definition of transmitted messages in order to allow for the incorporation of richer alphabets. Thus, we assume that the alphabet at sensor k is of the form $\{-d^k, \ldots, -1, 1, \ldots, d^k\}$, where $d^k \ge 1$ is some positive integer, and we suggest that sensor k transmits at time $\tau^k_n$ the following message to the fusion center



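(5.2)    $z^k_n := j$ if $\overline{\epsilon}^k_{j-1} \le \ell^k_n - \overline{\Delta}^k < \overline{\epsilon}^k_j$, and $z^k_n := -j$ if $\underline{\epsilon}^k_{j-1} \le -(\ell^k_n + \underline{\Delta}^k) < \underline{\epsilon}^k_j$,

where $j = 1, \ldots, d^k$. The thresholds $\{\overline{\epsilon}^k_j, \underline{\epsilon}^k_j, \; j = 1, \ldots, d^k - 1\}$ are selected so that

(5.3)    $P_0(\ell^k_n - \overline{\Delta}^k \ge \overline{\epsilon}^k_j \mid \ell^k_n \ge \overline{\Delta}^k) = 1 - \frac{j}{d^k}, \qquad P_\infty(\ell^k_n + \underline{\Delta}^k \le -\underline{\epsilon}^k_j \mid \ell^k_n \le -\underline{\Delta}^k) = 1 - \frac{j}{d^k},$

whereas $\overline{\epsilon}^k_0 = \underline{\epsilon}^k_0 := 0$, $\overline{\epsilon}^k_{d^k} := \mathrm{ess\,sup}\, u^k_1$ and $\underline{\epsilon}^k_{d^k} := \mathrm{ess\,sup}(-u^k_1)$. In this way, the overshoot $\ell^k_n - \overline{\Delta}^k$ is equally likely to lie in each interval $[\overline{\epsilon}^k_{j-1}, \overline{\epsilon}^k_j)$ and, similarly, $-(\ell^k_n + \underline{\Delta}^k)$ is equally likely to lie in each interval $[\underline{\epsilon}^k_{j-1}, \underline{\epsilon}^k_j)$, where $j = 1, \ldots, d^k$.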
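The message construction can be illustrated with a short Python sketch. It assumes that a sample of simulated overshoots is already available and uses empirical quantiles as stand-ins for the equiprobable thresholds; the function names and the left-closed interval convention are illustrative assumptions, not taken from the paper.

```python
from bisect import bisect_right

def equiprobable_thresholds(overshoots, d):
    """Interior thresholds eps_1 <= ... <= eps_{d-1}, taken as the empirical
    j/d quantiles of a sample of simulated overshoots."""
    ordered = sorted(overshoots)
    n = len(ordered)
    return [ordered[(j * n) // d] for j in range(1, d)]

def encode(ell, up, down, eps_up, eps_down):
    """Map the increment `ell` at a communication time to a message in
    {-d, ..., -1, 1, ..., d}; eps_up and eps_down are the interior
    thresholds for upward and downward overshoots."""
    if ell >= up:
        return bisect_right(eps_up, ell - up) + 1
    if ell <= -down:
        return -(bisect_right(eps_down, -(ell + down)) + 1)
    raise ValueError("increment is still inside the continuation interval")
```

With empty threshold lists (d = 1) the encoder reduces to the one-bit sign message, so the same function covers every alphabet size.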

The second modification is in the way the fusion center approximates the log-likelihood ratio $u^k_t$ at some arbitrary time t. In the previous section, we used the value of $u^k$ at the most recent communication time from sensor k, recall (4.4). This approximation was possible due to the path-continuity of $\{u^k_t\}$, which allowed the fusion center to learn the exact value of each binary random variable $\ell^k_n$. Since this is no longer the case in discrete time, we now define

(5.4)    $\tilde{u}^k_t := \sum_{n=1}^{m^k_t} \tilde{\ell}^k_n, \qquad m^k_t := \max\{n : \tau^k_n \le t\},$

where $\tilde{\ell}^k_n$ is now an approximation of $\ell^k_n$ that relies on $\tau^k_n - \tau^k_{n-1}$ and $z^k_n$, the information that becomes available to the fusion center at time $\tau^k_n$. A straightforward choice is to define $\tilde{\ell}^k_n$ as the log-likelihood ratio of the pair $(\tau^k_n - \tau^k_{n-1}, z^k_n)$. Unfortunately, this is not possible in practice, since the distribution of the inter-communication times $(\tau^k_n - \tau^k_{n-1})_{n \in \mathbb{N}}$ is typically intractable. For this reason, we follow a partial likelihood approach and define $\tilde{\ell}^k_n$ as the log-likelihood ratio of $z^k_n$:

(5.5)    $\tilde{\ell}^k_n := \sum_{j=1}^{d^k} \left[ \overline{\Lambda}^k_j 1_{\{z^k_n = j\}} - \underline{\Lambda}^k_j 1_{\{z^k_n = -j\}} \right],$

where

(5.6)    $\overline{\Lambda}^k_j := \log \frac{P_0(z^k_n = j)}{P_\infty(z^k_n = j)}, \qquad \underline{\Lambda}^k_j := \log \frac{P_\infty(z^k_n = -j)}{P_0(z^k_n = -j)}.$

Note that since $\{u^k_t\}_{t \in \mathbb{N}}$ is a random walk, the messages $(z^k_n)_{n \in \mathbb{N}}$ are iid, which is why $\overline{\Lambda}^k_j$ and $\underline{\Lambda}^k_j$ do not depend on n.

The proposed detection rule at the fusion center will then have the same form as in (4.5), i.e. $\tilde{S} := \inf\{t \in \mathbb{N} : \tilde{y}_t \ge \tilde{\nu}\}$, where $\tilde{y}_t := \tilde{u}_t - \inf_{0 \le s \le t} \tilde{u}_s$, $\tilde{u}_t := \sum_{k=1}^{K} \tilde{u}^k_t$, and the threshold $\tilde{\nu}$ is assumed to satisfy $E_\infty[-u_{\tilde{S}}] = \gamma$.



5.2. Design and implementation. As in the previous section, $\overline{\Delta}^k$ and $\underline{\Delta}^k$ control the communication activity of sensor k. Since each $\{u^k_t\}$ is a random walk, we have $E_0[\tau^k_n - \tau^k_{n-1}] = E_0[\tau^k_1]$ and $E_\infty[\tau^k_n - \tau^k_{n-1}] = E_\infty[\tau^k_1]$, therefore $\overline{\Delta}^k$ and $\underline{\Delta}^k$ can be selected so that target values for $E_0[\tau^k_1]$ and $E_\infty[\tau^k_1]$ are attained. This can be done, for example, with a stochastic approximation algorithm, since the latter quantities are not in general available in closed form but can be approximated with simulations.

According to (5.3), the thresholds $\{\overline{\epsilon}^k_j\}$ and $\{\underline{\epsilon}^k_j\}$ are percentiles of $\ell^k_1 - \overline{\Delta}^k$ and $-(\ell^k_1 + \underline{\Delta}^k)$, thus their computation simply requires the simulation of $\ell^k_1$.

After having specified $\overline{\Delta}^k, \underline{\Delta}^k$ and $\{\overline{\epsilon}^k_j, \underline{\epsilon}^k_j, \; j = 1, \ldots, d^k - 1\}$, the next step is to compute $\{\overline{\Lambda}^k_j, \underline{\Lambda}^k_j\}$, which can also be done with simulations, since these quantities do not admit closed-form expressions. However, this is not a straightforward task if one uses their definition in (5.6), since this expression requires the simulation of rare events. Nevertheless, we can overcome this problem using the following lemma.

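Lemma 1. For every $1 \le j \le d^k$,

(5.7)    $\overline{R}^k_j := \overline{\Lambda}^k_j - \overline{\Delta}^k_j = -\log E_0\left[ e^{-(\ell^k_n - \overline{\Delta}^k_j)} \mid z^k_n = j \right],$
         $\underline{R}^k_j := \underline{\Lambda}^k_j - \underline{\Delta}^k_j = -\log E_\infty\left[ e^{\ell^k_n + \underline{\Delta}^k_j} \mid z^k_n = -j \right],$

where

(5.8)    $\overline{\Delta}^k_j := \overline{\Delta}^k + \overline{\epsilon}^k_{j-1}, \qquad \underline{\Delta}^k_j := \underline{\Delta}^k + \underline{\epsilon}^k_{j-1}.$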



Proof. The proof of this lemma can be found in Appendix B. □ 
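To illustrate why this representation is convenient, the Python sketch below estimates the upper log-likelihood ratio of a one-bit message for a Gaussian random walk by simulating only under the post-change measure, where an upward exit is a frequent event. It assumes unit-variance observations whose mean changes from 0 to `mu`, and a single-level alphabet; the function names, the default number of paths, and the seed are hypothetical choices for illustration.

```python
import math
import random

def exit_value(mu, up, down, rng):
    """Run the Gaussian log-likelihood-ratio random walk under the
    post-change measure until it leaves (-down, up); return the exit value."""
    u = 0.0
    while -down < u < up:
        xi = rng.gauss(mu, 1.0)        # post-change observation, N(mu, 1)
        u += mu * xi - 0.5 * mu * mu   # log-likelihood-ratio increment
    return u

def upper_llr_one_bit(mu, up, down, n_paths=20000, seed=0):
    """Monte Carlo estimate of the upper log-likelihood ratio for a one-bit
    message: the upper threshold minus the log of the conditional mean of
    exp(-overshoot), the mean being taken over upward exits."""
    rng = random.Random(seed)
    total, hits = 0.0, 0
    for _ in range(n_paths):
        ell = exit_value(mu, up, down, rng)
        if ell >= up:
            total += math.exp(-(ell - up))
            hits += 1
    return up - math.log(total / hits)
```

Since the overshoot is nonnegative, the estimate is never smaller than the upper threshold itself, in line with the nonnegativity of the correction term in the lemma.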
5.3. The error due to quantization. The transmission of $z^k_n$ requires the communication of $\lceil \log_2(2 d^k) \rceil = 1 + \lceil \log_2 d^k \rceil$ bits. When $d^k = 1$, $z^k_n$ is a one-bit message that simply informs the fusion center whether $\ell^k_n$ is above $\overline{\Delta}^k$ or below $-\underline{\Delta}^k$. When $d^k \ge 2$, $z^k_n$ also informs the fusion center regarding the size of the overshoots $\ell^k_n - \overline{\Delta}^k$ or $-(\ell^k_n + \underline{\Delta}^k)$, mitigating in this way the error due to quantization. The following lemma makes this observation more precise, providing an upper bound for the quantization error in an arbitrary transmission to the fusion center.
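The bit count per message is easy to tabulate. The helper below is a small Python illustration (the function name is hypothetical); it uses integer arithmetic to evaluate the ceiling of the base-2 logarithm of the alphabet size 2d exactly.

```python
def bits_per_message(d):
    """Bits needed for a message from the alphabet {-d, ..., -1, 1, ..., d},
    i.e. the ceiling of log2(2 d), computed without floating point."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    return (2 * d - 1).bit_length()
```

One sign bit plus the bits needed to index the overshoot level gives the same count, so doubling the number of levels costs exactly one extra bit per transmission.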

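Lemma 2. For any k, n we have

(5.9)    $\ell^k_n - \tilde{\ell}^k_n \le \eta^k_n := \sum_{j=1}^{d^k} \left[ (\ell^k_n - \overline{\Delta}^k_j) 1_{\{z^k_n = j\}} + \underline{R}^k_j 1_{\{z^k_n = -j\}} \right],$

and $E_0[\eta^k_n] \le \theta^k$, where $\theta^k$ is a constant that depends on $d^k$ so that $\theta^k \to 0$ as $d^k \to \infty$.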

Proof. The proof of this lemma can be found in Appendix B. □ 

Although it may seem counterintuitive at first, we expect that when $d^k = 1$, a very high rate of communication for sensor k (small $\{\overline{\Delta}^k, \underline{\Delta}^k\}$) will not be desirable, since it may lead to a fast accumulation of quantization error. However, as the previous lemma suggests, this will not be the case with larger alphabets. This intuition will be supported by our asymptotic analysis in the next subsection.

Quantizing the overshoots has another important characteristic that should be mentioned. Clearly the statistical behavior of the overshoots depends on the two main thresholds $\overline{\Delta}^k, \underline{\Delta}^k$, which control the average period with which the sensor communicates with the fusion center. However, this dependence is only minor, since the pdf of the overshoots converges to some limiting pdf as these thresholds become large. In other words, quantizing the overshoots is like quantizing a random variable with (almost) fixed statistics. Consequently, the mean square quantization error (or any other similar quality measure), for a fixed number of bits, will be (almost) independent of the two thresholds $\overline{\Delta}^k, \underline{\Delta}^k$.

Let us now turn to the classical quantization scheme (3.4) employed by Q-CUSUM, where quantization is applied to the value of $u^k_{nr} - u^k_{(n-1)r}$, where nr denotes the times the sensor communicates with the fusion center and r the corresponding period. It is easy to see that, for a fixed number of bits, if we increase the period r, the mean square quantization error will increase, since the difference $u^k_{nr} - u^k_{(n-1)r}$ will involve a larger sum of i.i.d. random variables. This becomes particularly obvious when these random variables are bounded, in which case the support of the sum increases linearly with r and we are asked, with the same number of bits, to quantize a larger range of values. This suggests that if we want to communicate with the fusion center at a smaller rate and preserve the same number of bits, this will inflict larger quantization errors and therefore additional performance degradation. This is clearly not the case with the quantization scheme we adopt for D-CUSUM, since increasing $\overline{\Delta}^k, \underline{\Delta}^k$ (to reduce the communication rate) leaves the mean square quantization error almost intact.

5.4. Asymptotic Optimality. Our goal in this section is to draw conclusions regarding the optimal design and asymptotic behavior of $\tilde{S}$ as $\gamma \to \infty$. For simplicity of presentation, we assume that there is a quantity $\Delta$ so that $\overline{\Delta}^k, \underline{\Delta}^k = \Theta(\Delta)$ as $\Delta, \overline{\Delta}^k, \underline{\Delta}^k \to \infty$ for all $1 \le k \le K$, which means that the rate of communication, before and after the change, for all sensors is of the same order. Furthermore, we use $\theta := \max_{1 \le k \le K} \theta^k$ as the parameter that controls the quantization error in any transmission to the fusion center, where we recall that $\theta^k$ appears in Lemma 2.

Theorem 2. As $\gamma \to \infty$ we have

(5.10)    $0 \le J[\tilde{S}] - J[S] \le \theta \, \frac{\log \gamma}{\Theta(\Delta)} + K \, \Theta(\Delta).$

Proof. The optimality of the CUSUM test implies that $J[\tilde{S}] \ge J[S]$, therefore it suffices to prove the second inequality in (5.10). For both the optimum CUSUM $S$ and D-CUSUM $\tilde{S}$ we know that $J[S] = E_0[u_S]$ and $J[\tilde{S}] = E_0[u_{\tilde{S}}]$, therefore we can write

(5.11)    $J[\tilde{S}] - J[S] = E_0[u_{\tilde{S}}] - E_0[u_S] = E_0[u_{\tilde{S}} - \tilde{u}_{\tilde{S}}] + E_0[\tilde{u}_{\tilde{S}}] - E_0[u_S].$

Thus, we need to provide suitable bounds for the three terms on the right-hand side of (5.11). Indeed, from Lemma 6 we have $E_0[u_S] \ge \log \gamma - K \Theta(1)$, whereas from Lemma 7 we have $E_0[\tilde{u}_{\tilde{S}}] \le \log \gamma + K \Theta(\Delta)$. Moreover, from Lemmas 8 and 9 we obtain

$E_0[u_{\tilde{S}} - \tilde{u}_{\tilde{S}}] \le K \Theta(\Delta) + \theta \, \frac{\log \gamma}{\Theta(\Delta)}.$

Applying these inequalities in (5.11) we obtain the desired result. The above lemmas, as well as some additional auxiliary results, are stated and proved in Appendices C and D. □
Let us now discuss the implications of this theorem. First of all, we observe that when K and $\Delta$ are fixed, the inflicted performance loss of the proposed scheme relative to the centralized CUSUM is bounded, i.e. $J[\tilde{S}] - J[S] = O(1)$ as $\gamma \to \infty$, only if $\theta^k \to 0$ so that $\theta^k \log \gamma = O(1)$ for every $1 \le k \le K$.

From Lemma 2 we know that $\theta^k \to 0$ as $d^k \to \infty$; however, it would be more interesting to have an explicit divergence rate for $d^k$ in terms of $\gamma$. It is easy to do so when each $u^k_1$ has bounded support, so that $\overline{\epsilon}^k_{d^k} < \infty$ and $\underline{\epsilon}^k_{d^k} < \infty$. Indeed, in this special case it follows from (5.3) (see also (B.10)) that $\theta^k = O(1/d^k)$, which means that the proposed scheme achieves second-order asymptotic optimality, when K and $\Delta$ are fixed, only if $d^k \to \infty$ so that $d^k = O(\log \gamma)$ for every $1 \le k \le K$. In other words, the number of bits required for each transmission of sensor k (for fixed K and $\Delta$) should be of the order $1 + O(\log \log \gamma)$ for every $1 \le k \le K$.

When we have a low communication rate in the sense that $\Delta \to \infty$, second-order asymptotic optimality cannot be preserved even when $\theta \to 0$. However, it can be preserved in a large sensor network ($K \to \infty$), as long as there is a sufficiently high communication rate, that is $\Delta \to 0$ so that $K \Delta = O(1)$, and additionally $\theta \to 0$ so that $\theta \log \gamma = O(\Delta)$.

First-order asymptotic optimality is achieved under significantly less strict conditions. Indeed, from Theorem 2 and Lemma 6 we have

$\frac{J[\tilde{S}] - J[S]}{J[S]} \le \frac{\theta \log \gamma / \Theta(\Delta) + K \, \Theta(\Delta)}{\log \gamma - K \, \Theta(1)}.$

Therefore, D-CUSUM is first-order asymptotically optimal when

(5.12)    $\frac{\theta}{\Delta} \to 0; \qquad \frac{\log \gamma}{K} \to \infty; \qquad \text{and} \quad \Delta = o\left( \frac{\log \gamma}{K} \right).$

If $\theta$ is bounded away from 0, i.e. if $\{d^k, 1 \le k \le K\}$ are fixed, then first-order asymptotic optimality of D-CUSUM requires a low communication rate, i.e. $\Delta \to \infty$, but at an order which is smaller than the order of the ratio $\log \gamma / K$, which must tend to $\infty$. On the other hand, if $d^k \to \infty$ for every $1 \le k \le K$ so that $\theta \to 0$, first-order asymptotic optimality is possible even when $\Delta$ is fixed.

The best possible upper bound for the inflicted performance loss of D-CUSUM is

(5.13)    $J[\tilde{S}] - J[S] \le O\left( \sqrt{K \theta \log \gamma} \right)$

and it is attained when $\Delta$, $\theta$ and K are selected to equate, in order of magnitude, the two terms of the upper bound in (5.10). This happens when

(5.14)    $\Delta = \Theta\left( \sqrt{\frac{\theta \log \gamma}{K}} \right).$

Therefore, when K and $\theta$ are fixed and $\Delta$ is selected using the previous relationship, D-CUSUM has the same asymptotic performance loss as Mei's scheme, that is $J[\tilde{S}] - J[S] = O(\sqrt{\log \gamma})$. However, its performance is significantly improved when $\theta \to 0$, even if $K, \Delta \to \infty$.
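The balancing argument can be checked numerically with a few lines of Python. The sketch below replaces every order-of-magnitude constant in the bound by 1, which is purely an illustrative assumption; the function names are hypothetical.

```python
import math

def loss_bound(delta, theta, gamma, num_sensors):
    """Two-term upper bound with all hidden constants set to 1 (assumption):
    a quantization term that shrinks with delta plus a communication term
    that grows with delta."""
    return theta * math.log(gamma) / delta + num_sensors * delta

def balanced_delta(theta, gamma, num_sensors):
    """Threshold order that equates the two terms of the bound."""
    return math.sqrt(theta * math.log(gamma) / num_sensors)
```

At the balanced value the two terms coincide and the bound equals twice the square root of the product of the number of sensors, the quantization parameter, and log(gamma); any other choice of threshold gives a larger value of this simplified bound.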

We conclude that D-CUSUM in discrete time has essentially the same asymptotic behavior as in continuous time, as long as the parameter that controls the overall quantization error, $\theta$, goes to zero as $\gamma \to \infty$. However, the very interesting implication of our analysis is that this can be achieved with a very small number of bits transmitted at each communication by each sensor. This is also verified by a small simulation experiment that we present in the next section.

Finally, we should note that when each sensor k transmits to the fusion center (with an "infinite-bit" message) the exact value of $\ell^k_n$ at each time $\tau^k_n$, then (5.10) remains valid with $\theta = 0$, similarly to the continuous-time setup of the previous section. Of course, the resulting detection rule will be more efficient than any version of D-CUSUM that uses finite-bit messages.

5.5. Simulation experiments and comparisons. In this section, we illustrate the performance characteristics of $\tilde{S}$ in the case of Gaussian observations. Thus, we assume that each sensor k takes independent, normally distributed observations with variance 1 and mean that changes from 0 to $\mu_k \ne 0$ at time $\tau$. Therefore, the local log-likelihood ratio process in this example is a Gaussian random walk, in particular

(5.15)    $u^k_t = \sum_{n=1}^{t} \left[ \mu_k \xi^k_n - \frac{\mu_k^2}{2} \right], \quad t \in \mathbb{N}.$



Note that the distribution of u\ under Po is the same as that of -u\ under 
Poo, consequently the pair {r k ,Z k ) will have the same distribution under Po 
and Poo. 

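We set $\overline{\Delta}^k = \underline{\Delta}^k = \Delta^k$ and $\overline{\epsilon}^k_j = \underline{\epsilon}^k_j = \epsilon^k_j$ for every $j = 1, \ldots, d^k - 1$, consequently it is also going to be $\overline{\Lambda}^k_j = \underline{\Lambda}^k_j = \Lambda^k_j$ for every $j = 1, \ldots, d^k - 1$. Moreover, we assume that $\mu_k = \mu$, $d^k = d$ and that $\Delta^k$ is chosen so that $E_0[\tau^k_1] = r$ for every $1 \le k \le K$.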

Our goal is to compare D-CUSUM $\tilde{S}$ with Q-CUSUM $\hat{S}$, which was defined in (3.5), when both rules use the same resources, i.e. the same number of bits per communication (one or two) and the same (average) rate of communication. Notice that such a legitimate comparison is not possible with decentralized rules, such as Mei's scheme, that cannot explicitly control their transmission rate. Of course, the ultimate benchmark is the centralized CUSUM test, which requires transmission of the complete sensor observation at every sampling time.

Table 1. Thresholds and Log-Likelihood Ratios

(a) $d^k = 1$

r = 3, μ = 1      1.287    1.87
r = 6, μ = 1      2.54     3.12

(b) $d^k = 2$

r = 3, μ = 1      1.287    1.87    1.54    2.94
r = 6, μ = 1      2.54     3.12    2.80    3.62

For $\mu = 1$, we present in Table 1 the parameters required for the implementation of D-CUSUM when the number of transmitted bits is d = 1 or d = 2 and the communication period is r = 3 or r = 6. Fig. 1 and Fig. 2 depict our simulation results. First of all, we observe that in all cases the operating characteristics of D-CUSUM $\tilde{S}$ are essentially parallel with those of the optimal centralized CUSUM, $S$. This is exactly the order-2 asymptotic optimality that we established theoretically. On the contrary, the performance loss of Q-CUSUM $\hat{S}$ diverges in every case except when we transmit information with an infinite number of bits. This is expected since, as we argued before, this scheme is not even asymptotically optimal of order-1. Moreover, we observe that in all cases D-CUSUM is significantly more efficient than Q-CUSUM, since with one or two bits it is either very close to or outperforms the infinite-bit Q-CUSUM.

Finally, it is worth mentioning that the performance difference between the one-bit and the infinite-bit D-CUSUM is only a minor percentage of the overall detection delay. This is particularly apparent when the average communication period is large (compare Fig. 1, where the average period is r = 3, with Fig. 2, where the average period is r = 6). Indeed, since larger communication periods imply larger values for the main thresholds $\overline{\Delta}^k, \underline{\Delta}^k$, the performance loss due to the unobserved overshoots is reduced. As a result, we do not experience significant gains by having the sensors transmit additional bits to the fusion center, which suggests that D-CUSUM enjoys in practice second-order asymptotic optimality even with one-bit transmissions. This is similar to the continuous-time case where, as we proved in Section 4, one-bit transmissions suffice for exact second-order asymptotic optimality. In summary, the simulation experiments corroborate the conclusions of our asymptotic analysis.
Fig 1. Case of K = 5 sensors with communication period r = 3.

Fig 2. Case of K = 5 sensors with communication period r = 6.



6. Conclusions. In this work, we formulated the problem of decentralized sequential change detection under a setup which takes into account quantization and rate constraints. We argued that this formulation is more appropriate for applications that rely on sensor networks, where the goal is to design an efficient detection rule while controlling the overall communication load. Moreover, we presented existing decentralized schemes under a unified framework that highlighted the communication activities they require.

We suggested a novel decentralized detection rule, according to which a sensor communicates with the fusion center at two-sided exit times of its local log-likelihood ratio process. The fusion center then uses a CUSUM-like rule in order to detect the change. The design and analysis of this scheme depend heavily on whether the sensors observe continuous-time processes with continuous paths or random walks in discrete time. However, in both cases we showed that, with an appropriate design, the proposed scheme has essentially the same behavior. More specifically, its performance loss with respect to the centralized CUSUM is bounded (second-order asymptotic optimality) for any fixed communication rate and any number of sensors, as long as the quantization error associated with each transmission is small. On the other hand, first-order asymptotic optimality is preserved even with a low communication rate and a large number of sensors.

Additionally, we illustrated with simulation experiments that the pro- 
posed scheme is very efficient in practice and that it performs significantly 
better than the corresponding decentralized scheme that communicates mes- 
sages of the same information content, at the same rate, but at deterministic 
times. 

Finally, we should emphasize that we were able to remove the assumption 
of independence across sensors in the special case that the sensor processes 
are correlated Brownian motions. However, it remains an open problem to 
establish general, asymptotically optimal, decentralized detection rules when 
the sensors take correlated observations. 

APPENDIX A 
In what follows, we will use the following notation

(A.1)    $S_x = \inf\{t \ge 0 : y_t \ge x\}, \qquad \tilde{S}_x = \inf\{t \ge 0 : \tilde{y}_t \ge x\},$

where $x > 0$. We also define the function

(A.2)    $\psi(x) := E_\infty[-u_{S_x}] = E_\infty[\langle u \rangle_{S_x}], \quad x > 0,$

and from (4.1) we conclude that

(A.3)    $\psi(x) = e^x - x - 1, \quad x > 0.$

Finally, as we have done so far, for any given $\gamma$ we denote by $\nu$ and $\tilde{\nu}$ the thresholds for which $E_\infty[-u_{S_\nu}] = E_\infty[-u_{\tilde{S}_{\tilde{\nu}}}] = \gamma$. Before presenting the proof of Theorem 1 we need the following auxiliary lemma.
Lemma 3. For any $\gamma > 0$ we have the following double inequality, which is true with probability 1:

(A.4)    $S_{\tilde{\nu} - 2C} \le \tilde{S}_{\tilde{\nu}} \le S_{\tilde{\nu} + 2C},$

where $C := K \Delta_{\max}$ and $\Delta_{\max} := \max_{1 \le k \le K} \max\{\overline{\Delta}^k, \underline{\Delta}^k\}$.

Proof. By definition, $\tilde{u}^k$ coincides with $u^k$ at the communication times $(\tau^k_n)_{n \in \mathbb{N}}$. Moreover, by (4.2) it is clear that the distance $|\tilde{u}^k_t - u^k_t|$ cannot be larger than $\max\{\overline{\Delta}^k, \underline{\Delta}^k\}$ between any two consecutive communication times. Thus, for any $t \ge 0$ we have

$|\tilde{u}_t - u_t| \le \sum_{k=1}^{K} |\tilde{u}^k_t - u^k_t| \le K \Delta_{\max} = C.$

As a result, for each $0 \le s \le t$ we have $u_t - C \le \tilde{u}_t \le u_t + C$,

$\inf_{0 \le s \le t} u_s - C \le \inf_{0 \le s \le t} \tilde{u}_s \le \inf_{0 \le s \le t} u_s + C,$

and applying these two inequalities in the definition of the statistic $\tilde{y}_t$ in (4.5) we obtain $y_t - 2C \le \tilde{y}_t \le y_t + 2C$. Finally, due to this pathwise inequality, we immediately deduce that

$\inf\{t : y_t - 2C \ge \tilde{\nu}\} \ge \inf\{t : \tilde{y}_t \ge \tilde{\nu}\} \ge \inf\{t : y_t + 2C \ge \tilde{\nu}\},$

which is the claim of the lemma. □

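Proof of Theorem 1. From the monotonicity of the quadratic variation process $\langle u \rangle$ and the previous lemma we have

(A.5)    $E_\infty[\langle u \rangle_{S_{\tilde{\nu} - 2C}}] \le E_\infty[\langle u \rangle_{\tilde{S}_{\tilde{\nu}}}] \le E_\infty[\langle u \rangle_{S_{\tilde{\nu} + 2C}}].$

Since $E_\infty[\langle u \rangle_{\tilde{S}_{\tilde{\nu}}}] = E_\infty[\langle u \rangle_{S_\nu}] = \gamma$, we obtain

$E_\infty[\langle u \rangle_{S_{\tilde{\nu} - 2C}}] \le E_\infty[\langle u \rangle_{S_\nu}] \le E_\infty[\langle u \rangle_{S_{\tilde{\nu} + 2C}}]$

and using (A.2) we can write $\psi(\tilde{\nu} - 2C) \le \psi(\nu) \le \psi(\tilde{\nu} + 2C)$. From (A.3) it is clear that $\psi(\cdot)$ is strictly increasing, which implies that $|\tilde{\nu} - \nu| \le 2C$. Therefore,

$J[\tilde{S}_{\tilde{\nu}}] - J[S_\nu] \le J[S_{\tilde{\nu} + 2C}] - J[S_\nu] = (e^{-\tilde{\nu} - 2C} - e^{-\nu}) + (\tilde{\nu} - \nu) + 2C \le 4C,$

where the first inequality follows from the previous lemma, the equality is due to (4.1) and the second inequality is due to the fact that $|\tilde{\nu} - \nu| \le 2C$. □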
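APPENDIX B

Proof of Lemma 1. With a change of measure $P_\infty \mapsto P_0$ we have

$P_\infty(z^k_n = j) = E_0\left[ e^{-\ell^k_n} 1_{\{z^k_n = j\}} \right] = e^{-\overline{\Delta}^k_j} E_0\left[ e^{-(\ell^k_n - \overline{\Delta}^k_j)} 1_{\{z^k_n = j\}} \right] = P_0(z^k_n = j) \, e^{-\overline{\Delta}^k_j} E_0\left[ e^{-(\ell^k_n - \overline{\Delta}^k_j)} \mid z^k_n = j \right].$

Thus, from the definition of $\overline{\Lambda}^k_j$ in (5.6) we have

$e^{-\overline{\Lambda}^k_j} = \frac{P_\infty(z^k_n = j)}{P_0(z^k_n = j)} = e^{-\overline{\Delta}^k_j} E_0\left[ e^{-(\ell^k_n - \overline{\Delta}^k_j)} \mid z^k_n = j \right]$

and taking logarithms we obtain the first equality in (5.7), that is

$\overline{\Lambda}^k_j - \overline{\Delta}^k_j = -\log E_0\left[ e^{-(\ell^k_n - \overline{\Delta}^k_j)} \mid z^k_n = j \right].$

The second equality can be shown in a similar way. □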



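Proof of Lemma 2. In order to prove (5.9), we recall (5.5)-(5.6) and we argue as follows:

(B.1)    $\ell^k_n - \tilde{\ell}^k_n = \sum_{j=1}^{d^k} \left[ (\ell^k_n - \overline{\Delta}^k_j - \overline{R}^k_j) 1_{\{z^k_n = j\}} + (\ell^k_n + \underline{\Delta}^k_j + \underline{R}^k_j) 1_{\{z^k_n = -j\}} \right] \le \sum_{j=1}^{d^k} \left[ (\ell^k_n - \overline{\Delta}^k_j) 1_{\{z^k_n = j\}} + \underline{R}^k_j 1_{\{z^k_n = -j\}} \right] = \eta^k_n,$

where the inequality is correct because $\overline{R}^k_j \ge 0$, which follows from (5.7), and $\ell^k_n + \underline{\Delta}^k_j \le 0$ on $\{z^k_n = -j\}$. It remains to show that $E_0[\eta^k_n]$ is bounded by a quantity that goes to 0 as $d^k \to \infty$. In order to do so, we denote by $\overline{P}_0$ the probability measure $P_0$ conditional on the event $\{z^k_n > 0\} = \{\ell^k_n \ge \overline{\Delta}^k\}$ and by $\overline{E}_0$ the corresponding expectation. Then, taking expectations in (B.1) we have

$E_0[\eta^k_n] = \sum_{j=1}^{d^k} \left[ E_0[(\ell^k_n - \overline{\Delta}^k_j) 1_{\{z^k_n = j\}}] + P_0(z^k_n = -j) \, \underline{R}^k_j \right]$

(B.2)    $\le \sum_{j=1}^{d^k} \left[ \overline{E}_0[(\ell^k_n - \overline{\Delta}^k_j) 1_{\{z^k_n = j\}}] + \underline{R}^k_j \, P_0(z^k_n = -j) \right]$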
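and it suffices to upper bound appropriately every term in (B.2).

First of all, for any $1 \le j \le d^k - 1$ we have $\ell^k_n - \overline{\Delta}^k_j \le \overline{\epsilon}^k_j - \overline{\epsilon}^k_{j-1}$ on $\{z^k_n = j\}$. Therefore, taking expectations and recalling (5.3) we have

$\overline{E}_0[(\ell^k_n - \overline{\Delta}^k_j) 1_{\{z^k_n = j\}}] \le (\overline{\epsilon}^k_j - \overline{\epsilon}^k_{j-1}) \, \overline{P}_0(z^k_n = j) \le \frac{\delta^k}{d^k},$

and consequently

(B.3)    $\sum_{j=1}^{d^k - 1} \overline{E}_0[(\ell^k_n - \overline{\Delta}^k_j) 1_{\{z^k_n = j\}}] \le (d^k - 1) \frac{\delta^k}{d^k} \le \delta^k,$

where $\delta^k$ is a finite constant defined as follows:

(B.4)    $\delta^k := \max_{1 \le j \le d^k - 1} \max\{\overline{\epsilon}^k_j - \overline{\epsilon}^k_{j-1}, \; \underline{\epsilon}^k_j - \underline{\epsilon}^k_{j-1}\}.$

Since $\overline{\Delta}^k_{d^k} = \overline{\Delta}^k + \overline{\epsilon}^k_{d^k - 1}$ and $\overline{\epsilon}^k_{d^k} = \mathrm{ess\,sup}\, u^k_1$ we have:

(B.5)    $\overline{E}_0[(\ell^k_n - \overline{\Delta}^k_{d^k}) 1_{\{z^k_n = d^k\}}] = \int_{\overline{\epsilon}^k_{d^k - 1}}^{\overline{\epsilon}^k_{d^k}} \overline{P}_0(\ell^k_n - \overline{\Delta}^k > x) \, dx.$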

From [12, Theorem 4, Eq. (13)] we obtain 

P„(<; - A* > x) < ^ (-_j±- j E [(2«* - x)l M5x) ] 

^l^R]( 1 + ^) EoKl «^> ] ' 

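where $D = E_0[((u^k_1)^+)^2] / E_0[u^k_1] \le E_0[(u^k_1)^2] / E_0[u^k_1] < \infty$. Now note that for $\overline{\Delta}^k$ larger than some limiting value which is bounded away from 0, we have $(2 / E_0[u^k_1])(1 + D / \overline{\Delta}^k) \le \overline{c}^k$, for a properly selected constant $\overline{c}^k$ that does not depend on $\overline{\Delta}^k$ and $d^k$. Using this upper bound in (B.5) and applying Fubini's theorem we obtain

$\overline{E}_0[(\ell^k_n - \overline{\Delta}^k_{d^k}) 1_{\{z^k_n = d^k\}}] \le \overline{c}^k \int_{\overline{\epsilon}^k_{d^k - 1}}^{\infty} E_0[u^k_1 1_{\{u^k_1 > x\}}] \, dx$

(B.6)    $= \overline{c}^k E_0[u^k_1 (u^k_1 - \overline{\epsilon}^k_{d^k - 1})^+] \le \overline{c}^k E_0[(u^k_1)^2 1_{\{u^k_1 > \overline{\epsilon}^k_{d^k - 1}\}}].$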
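For any $1 \le j \le d^k$, with a change of measure $P_0 \mapsto P_\infty$ we obtain

(B.7)    $P_0(z^k_n = -j) = E_\infty[e^{\ell^k_n} 1_{\{z^k_n = -j\}}] \le P_\infty(z^k_n = -j) \le P_\infty(z^k_n = -j \mid z^k_n < 0) = \frac{1}{d^k},$

since $\ell^k_n \le -\underline{\Delta}^k \le 0$ on $\{z^k_n < 0\}$. Therefore, for any $1 \le j \le d^k - 1$ we have

$P_0(z^k_n = -j) \, \underline{R}^k_j \le \frac{\delta^k}{d^k},$

since from (5.7), and the fact that $-(\ell^k_n + \underline{\Delta}^k_j) \le \underline{\epsilon}^k_j - \underline{\epsilon}^k_{j-1}$ on $\{z^k_n = -j\}$, it follows that $\underline{R}^k_j \le \underline{\epsilon}^k_j - \underline{\epsilon}^k_{j-1} \le \delta^k$ for $1 \le j \le d^k - 1$, and consequently we obtain

(B.8)    $\sum_{j=1}^{d^k - 1} P_0(z^k_n = -j) \, \underline{R}^k_j \le (d^k - 1) \frac{\delta^k}{d^k} \le \delta^k.$

Finally, from an application of the conditional Jensen inequality in the second equality in (5.7) we have

(B.9)    $P_0(z^k_n = -d^k) \, \underline{R}^k_{d^k} \le P_0(z^k_n = -d^k) \, E_\infty[-(\ell^k_n + \underline{\Delta}^k_{d^k}) \mid z^k_n = -d^k]$
         $= \frac{P_0(z^k_n = -d^k)}{P_\infty(z^k_n = -d^k)} \, E_\infty[-(\ell^k_n + \underline{\Delta}^k_{d^k}) 1_{\{z^k_n = -d^k\}}]$
         $\le E_\infty[-(\ell^k_n + \underline{\Delta}^k_{d^k}) 1_{\{z^k_n = -d^k\}}]$
         $\le \underline{c}^k E_\infty[(u^k_1)^2 1_{\{u^k_1 < -\underline{\epsilon}^k_{d^k - 1}\}}],$

where $\underline{c}^k$ is a constant term as $\overline{\Delta}^k, \underline{\Delta}^k, d^k \to \infty$. The second inequality follows from (B.7), whereas the third inequality can be shown in a similar way as (B.6).

Therefore, if we apply (B.3), (B.6), (B.8) and (B.9) to (B.2), we obtain $E_0[\eta^k_n] \le \theta^k$, where

(B.10)    $\theta^k := 2 \delta^k + 2 \max\left\{ \overline{c}^k E_0[(u^k_1)^2 1_{\{u^k_1 > \overline{\epsilon}^k_{d^k - 1}\}}], \; \underline{c}^k E_\infty[(u^k_1)^2 1_{\{u^k_1 < -\underline{\epsilon}^k_{d^k - 1}\}}] \right\}.$

In order to complete the proof we need to show that $\theta^k \to 0$ as $d^k \to \infty$. This follows directly from the finiteness of the second moment of $u^k_1$ and the definition of the thresholds $\{\overline{\epsilon}^k_j, \underline{\epsilon}^k_j\}$ in (5.3), which implies that $\delta^k \to 0$ as $d^k \to \infty$. □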
APPENDIX C

Our goal in this Appendix is to prove Lemma 4, which connects the threshold $\tilde{\nu}$ to the false-alarm period, $\gamma$. In order to provide an elegant proof of this result, we need to adopt an alternative representation of the fusion center policy (that we will use only in this Appendix). Indeed, since the implementation of $\tilde{S}$ requires only the knowledge of the transmitted messages at the fusion center, it is possible to describe the fusion rule without any reference to the communication times $\{\tau^k_n\}$. Thus, let $z_n$ be the nth message that arrives at the fusion center and $k_n$ the corresponding identity of the sensor which transmitted this message. Of course, since time is discrete, there is non-zero probability that the fusion center may receive messages from two or more sensors concurrently. In this case, we enumerate the simultaneous messages in an arbitrary order and we keep the same order for the labels.

We can then describe the flow of information at the fusion center by the filtration $\{\mathcal{C}_n\}_{n \in \mathbb{N}}$, where $\mathcal{C}_n = \sigma((z_1, k_1), \ldots, (z_n, k_n))$. For any $n \in \mathbb{N}$ we set


P (ki,...,k nJ 

log 



/q j\ Poo(&l, • • • j k n ) 

P0O1, • • • ,z n \ki, . . . , k n ) 

Vn := og p~7^ TTki k~v 

and recalling the definition of the log-likelihood ratios $\overline{\Lambda}^k_j, \underline{\Lambda}^k_j$ in (5.6), we have

(C.2)    $v_n = \sum_{m=1}^{n} \sum_{j=1}^{d^{k_m}} \left[ \overline{\Lambda}^{k_m}_j 1_{\{z_m = j\}} - \underline{\Lambda}^{k_m}_j 1_{\{z_m = -j\}} \right].$

Then, the number of messages which the fusion center has received until an alarm is raised by D-CUSUM is given by the following $\{\mathcal{C}_n\}$-stopping time:

(C.3)    $\mathcal{N} = \inf\{n \in \mathbb{N} : v_n - \min_{m = 1, \ldots, n} v_m \ge \tilde{\nu}\}.$

The process $\{v_n\}$ and the stopping time $\mathcal{N}$ are closely related to $\{\tilde{u}_t\}$ and $\tilde{S}$, respectively. Their main difference is that $\{\tilde{u}_t\}$ and $\tilde{S}$ are expressed in terms of "physical time", whereas $\{v_n\}$ and $\mathcal{N}$ are expressed in terms of the number of messages transmitted to the fusion center. If we denote by $\tau_n$ the time instant at which the nth message arrives at the fusion center, then we can explicitly specify the following connection between these quantities: $\tilde{u}_{\tau_n} = v_n$ and $\tilde{S} = \tau_{\mathcal{N}}$. In other words, $\mathcal{N}$ denotes the number of received messages at the fusion center until stopping at time $\tilde{S}$.
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After these definitions, we can now prove Lemma 4, which connects v to 
7 through an inequality that will be important for the performance analysis 
of S. 

Lemma 4. For any 7 > and v so that 
(C.4) 5<log7-bg(J 0O ), 
where 1^ is defined in (3.2). 

PROOF. We first observe that 

(C.5) 7 = E^l-ug] = K/oo EoofcS] > Too Eqo [AT] . 

The second equality follows from an application of Wald's identity, whereas 
the inequality from the fact that N < KS. Indeed, the maximum number 
of received messages until stopping at S is obtained when at every time 
instant we have all sensors transmitting a message to the fusion center and 
this yields KS. 

From (C.5) it is clear that it suffices to prove $E_\infty[\mathcal{N}] \ge e^{\tilde{\nu}}$. In order to 
do so, let us define the sequence $\{n_j\}$ of epochs where the CUSUM process 
$v_n - \min_{0 \le m \le n} v_m$ either returns to zero (restarts) or exceeds $\tilde{\nu}$. This is the 
classical way to write the CUSUM stopping time as a sum of a random 
number of components. Specifically, let us define 

(C.6)  $n_j := \inf\{n > n_{j-1} : v_n - v_{n_{j-1}} \notin (0, \tilde{\nu})\}$, 
       $\mathcal{R} := \inf\{j \in \mathbb{N} : v_{n_j} - v_{n_{j-1}} \ge \tilde{\nu}\}$. 

Then we clearly have $\mathcal{N} = n_{\mathcal{R}}$. Since from one epoch to the next we count 
at least one additional message, we trivially conclude that $\mathcal{R} \le \mathcal{N}$ and, 
therefore, $E_\infty[\mathcal{R}] \le E_\infty[\mathcal{N}]$. We can now claim that it suffices to show that 

(C.7)  $P_\infty(\mathcal{R} > j) \ge (1 - e^{-\tilde{\nu}})^j, \quad \forall\, j \in \mathbb{N}$. 

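In order to justify this claim, observe first that $E_\infty[\mathcal{N}] < \infty$, since $\mathcal{N}$ is a 
CUSUM stopping time. As a result, $E_\infty[\mathcal{R}]$ is finite as well and consequently 
(C.7) implies that 

$E_\infty[\mathcal{N}] \ge E_\infty[\mathcal{R}] = \sum_{j=0}^{\infty} P_\infty(\mathcal{R} > j) \ge \sum_{j=0}^{\infty} (1 - e^{-\tilde{\nu}})^j = e^{\tilde{\nu}}$. 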
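
The last step of the display above uses the geometric series $\sum_{j \ge 0} (1 - e^{-\tilde{\nu}})^j = e^{\tilde{\nu}}$. The following Python sketch checks this closed form numerically. It is illustrative only: the helper name `partial_geometric_sum` and the threshold value are assumptions, with `nu` standing for the D-CUSUM threshold.

```python
import math


def partial_geometric_sum(nu: float, terms: int) -> float:
    """Sum of (1 - exp(-nu))**j for j = 0, ..., terms - 1."""
    ratio = 1.0 - math.exp(-nu)  # common ratio, strictly between 0 and 1 for nu > 0
    return sum(ratio ** j for j in range(terms))


nu = 2.0  # illustrative threshold value (assumption)
approx = partial_geometric_sum(nu, terms=500)
# The partial sums increase towards exp(nu) as the number of terms grows.
print(approx, math.exp(nu))
```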

In order to prove (C.7), we start with the following observation: 

(C.8)  $P_\infty(\mathcal{R} > j) = P_\infty(\mathcal{R} > j-1,\; v_{n_j} - v_{n_{j-1}} \le 0)$ 
       $= P_\infty(\mathcal{R} > j-1) - P_\infty(\mathcal{R} > j-1,\; v_{n_j} - v_{n_{j-1}} \ge \tilde{\nu})$. 

Let us now set $A := \{\mathcal{R} > j-1,\; v_{n_j} - v_{n_{j-1}} \ge \tilde{\nu}\}$. Then, it is clear that 
$A \in \mathcal{C}_{n_j}$ and with a change of measure $P_\infty \mapsto P_0$ we obtain 

(C.9)  $P_\infty(A) = \int_A \mathcal{L}_{n_j}^{-1} \, dP_0$, 

where for every $n \in \mathbb{N}$ we define 

(C.10)  $\mathcal{L}_n := e^{\phi_n + v_n}, \quad n \in \mathbb{N}$. 

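We now argue as follows: 

(C.11)  $P_\infty(A) = \int_A \mathcal{L}_{n_{j-1}}^{-1} \, e^{-(\phi_{n_j} - \phi_{n_{j-1}}) - (v_{n_j} - v_{n_{j-1}})} \, dP_0$ 
        $\le e^{-\tilde{\nu}} \int_A \mathcal{L}_{n_{j-1}}^{-1} \, e^{-(\phi_{n_j} - \phi_{n_{j-1}})} \, dP_0$ 
        $\le e^{-\tilde{\nu}} \int_{\{\mathcal{R} > j-1\}} \mathcal{L}_{n_{j-1}}^{-1} \, e^{-(\phi_{n_j} - \phi_{n_{j-1}})} \, dP_0$ 
        $= e^{-\tilde{\nu}} \int_{\{\mathcal{R} > j-1\}} \mathcal{L}_{n_{j-1}}^{-1} \, E_0\left[ e^{-(\phi_{n_j} - \phi_{n_{j-1}})} \mid \mathcal{C}_{n_{j-1}} \right] dP_0$. 

The first inequality is due to the fact that $v_{n_j} - v_{n_{j-1}} \ge \tilde{\nu}$ on the event 
$A$. The second inequality holds because $A \subset \{\mathcal{R} > j-1\}$, whereas the 
last equality follows from the law of iterated expectation and the fact that 
$\{\mathcal{R} > j-1\} \in \mathcal{C}_{n_{j-1}}$ and $\mathcal{L}_{n_{j-1}}$ is a $\mathcal{C}_{n_{j-1}}$-measurable random variable. 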

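Suppose now that 

(C.12)  $E_0\left[ e^{-(\phi_{n_j} - \phi_{n_{j-1}})} \mid \mathcal{C}_{n_{j-1}} \right] = 1$ 

(a claim that we will prove shortly). Then, it is clear with a change of 
measure $P_0 \mapsto P_\infty$ that (C.11) reduces to 

(C.13)  $P_\infty(A) \le e^{-\tilde{\nu}} \int_{\{\mathcal{R} > j-1\}} \mathcal{L}_{n_{j-1}}^{-1} \, dP_0 = e^{-\tilde{\nu}} P_\infty(\mathcal{R} > j-1)$. 

Substituting the outcome of (C.13) in (C.8) and applying it repeatedly yields 

$P_\infty(\mathcal{R} > j) \ge (1 - e^{-\tilde{\nu}}) P_\infty(\mathcal{R} > j-1) \ge (1 - e^{-\tilde{\nu}})^j$. 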

Thus, in order to complete the proof it remains to justify (C.12). Since 
$\{e^{-\phi_n}\}_{n \in \mathbb{N}}$ is a $\{\mathcal{C}_n\}$-martingale under $P_0$, as a likelihood ratio, and $n_{j-1}$, $n_j$ 
are $\{\mathcal{C}_n\}$-adapted stopping times, it suffices to show that the optional 
sampling theorem can be applied. More specifically, since $n_{j-1} < n_j$, we need to 
show that 

(C.14)  $E_0[e^{-\phi_{n_j}}] < \infty$ and $\lim_{m \to \infty} E_0[e^{-\phi_m} 1_{\{n_j > m\}}] = 0$. 

Indeed, since $\phi_{n_j}$ is a $\mathcal{C}_{n_j}$-measurable random variable, with a change of 
measure $P_0 \mapsto P_\infty$ we have 

(C.15)  $E_0[e^{-\phi_{n_j}}] = E_\infty[e^{-\phi_{n_j}} \mathcal{L}_{n_j}] = E_\infty[e^{v_{n_j}}] < \infty$, 

and the last term is finite, since $v_{n_j}$ cannot exceed $\tilde{\nu}$ by more than $\sum_{k=1}^{K} \max_j \overline{\Lambda}_j^k$, 
which is a finite quantity. 

With the same change of measure argument we can show that the second 
condition in (C.14) is satisfied. Indeed, 

$E_0[e^{-\phi_m} 1_{\{n_j > m\}}] = E_\infty[e^{-\phi_m} \mathcal{L}_m 1_{\{n_j > m\}}] = E_\infty[e^{-\phi_m} e^{\phi_m + v_m} 1_{\{n_j > m\}}]$ 
$= E_\infty[e^{v_m} 1_{\{n_j > m\}}] \le e^{\tilde{\nu}} P_\infty(n_j > m)$, 

where the last inequality is due to the fact that $v_m < \tilde{\nu}$ on $\{n_j > m\}$. Since 
$n_j$ is almost surely finite, we have $P_\infty(n_j > m) \to 0$ as $m \to \infty$, which shows 
that the second condition in (C.14) is also satisfied and this completes the 
proof. □ 

APPENDIX D 

Our goal in this Appendix is to state and prove Lemmas 6, 7, 8 and 
9, which are used in the proof of Theorem 2. In order to do so, we will 
need Lemma 5, which provides an asynchronous version of Wald's identity 
that is very useful for our purposes. We recall that $m_t^k$ is the number of messages that 
have been transmitted by sensor $k$ to the fusion center up to time $t$ and 
we denote by $m_t$ the number of messages that have been transmitted by all 
sensors up to time $t$, i.e. 

(D.1)  $m_t^k := \max\{n \in \mathbb{N} : \tau_n^k \le t\}, \qquad m_t := \sum_{k=1}^{K} m_t^k$. 



Lemma 5. Consider a generic sequence $\{\zeta_n^k\}$, where each $\zeta_n^k$ is an 
arbitrary (Borel) function of the triplet $(\tau_n^k - \tau_{n-1}^k, z_n^k, \eta_n^k)$. Thus, $\{\zeta_n^k\}$ is a 
sequence of independent and identically distributed random variables 
under both $P_0$ and $P_\infty$. If $T$ is a $P_0$-integrable $\{\mathcal{F}_t\}$-stopping time and 
$E_0[|\zeta_1^k|] < \infty$, then 

(D.2)  $E_0\left[ \sum_{n=1}^{m_T^k + 1} \zeta_n^k \right] = E_0[m_T^k + 1] \, E_0[\zeta_1^k]$; 

(D.3)  $E_0\left[ \sum_{n=1}^{m_T^k} \zeta_n^k \right] \le (E_0[m_T^k] + 1) \, E_0[\zeta_1^k]$. 

Finally, if $|\zeta_n^k| \le M^k$, where $M^k$ is some finite constant, then 

(D.4)  $E_0\left[ \sum_{n=1}^{m_T^k} \zeta_n^k \right] \ge E_0[m_T^k] \, E_0[\zeta_1^k] - 2 M^k$. 


Proof. The proof can be found in [6]. □ 

Lemma 6. The optimum performance can be lower bounded as follows: 

(D.5)  $E_0[u_{\mathcal{S}}] \ge \log \gamma - \Theta(K)$. 

Proof. Let us first define for any $r \ge 0$ the stopping times 

$T_r^+ = \inf\{t > 0 : u_t \ge r\}, \qquad T_r^- = \inf\{t > 0 : -u_t \ge r\}$. 

Due to the representation of the CUSUM stopping time as a repeated SPRT 
with thresholds $0$ and $\nu$, we have the following well-known formula (see, for 
example, Siegmund [37, Page 25]) for its expectation under $P_0$ and $P_\infty$: 

(D.6)  $E_i[u_{\mathcal{S}}] = \frac{E_i[u_T]}{P_i(u_T \ge \nu)}, \quad i = 0, \infty$, 

where $T = \min\{T_0^-, T_\nu^+\}$ is the SPRT stopping time with boundaries $0$ and 
$\nu$. Using (D.6) for $i = 0$, we can now write 



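(D.7)  $E_0[u_{\mathcal{S}}] = \frac{E_0[u_T 1_{\{u_T \ge \nu\}}] + E_0[u_T 1_{\{u_T \le 0\}}]}{P_0(u_T \ge \nu)} \ge \nu - \frac{E_0[(-u_T) 1_{\{u_T \le 0\}}]}{P_0(u_T \ge \nu)}$. 

We start with the numerator and with a change of measure we have 

(D.8)  $E_0[-u_T 1_{\{u_T \le 0\}}] = E_\infty[e^{u_T} (-u_T) 1_{\{u_T \le 0\}}] \le E_\infty[-u_T 1_{\{u_T \le 0\}}]$. 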
We can now strengthen this inequality as follows: 

(D.9)  $E_\infty[-u_T 1_{\{u_T \le 0\}}] = E_\infty[-u_{T_0^-} 1_{\{T_0^- < T_\nu^+\}}] \le E_\infty[-u_{T_0^-}]$ 
       $\le \sup_{r \ge 0} E_\infty[-u_{T_r^-} - r] \le \frac{E_\infty[(u_1)^2]}{E_\infty[-u_1]}$ 
       $= \frac{\sigma_\infty^2}{I_\infty} + K I_\infty$, 

where $I_i := \frac{1}{K} \sum_{k=1}^{K} I_i^k$ is the average, over all sensors, of the 
Kullback-Leibler information numbers and $\sigma_i^2 := \frac{1}{K} \sum_{k=1}^{K} \mathrm{Var}_i[u_1^k]$ is the average, over 
all sensors, of the variances of the local likelihood ratios $u_1^k$, under the 
probability measure $P_i$, $i = 0, \infty$. The second inequality in the second line in 
(D.9) follows from Lorden's [12] upper bound for the average overshoot, 
strengthened by observing that $(u_1^-)^2 \le (u_1)^2$. 

Furthermore, for the denominator in (D.7) we have 

(D.10)  $P_0(u_T \ge \nu) = P_0(T_\nu^+ < T_0^-) \ge P_0(T_0^- = \infty) = \frac{1}{E_0[T_0^+]} = \frac{K I_0}{E_0[u_{T_0^+}]}$ 
        $\ge \frac{K I_0}{\sup_{r \ge 0} E_0[u_{T_r^+} - r]} \ge \frac{(K I_0)^2}{K \sigma_0^2 + (K I_0)^2} = \frac{I_0^2}{K^{-1} \sigma_0^2 + I_0^2}$ 
        $\ge \frac{I_0^2}{\sigma_0^2 + I_0^2}$. 

The second equality in the first line is a classical result of random walk 
theory (see, for example, Siegmund [37, Corollary 8.39, Page 173]), whereas 
the third equality in the first line is an application of Wald's identity. The 
second inequality in the second line is again the upper bound provided by 
Lorden [12] for the overshoot, while the last inequality is true because $K \ge 1$. 
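From (D.8), (D.9) and (D.10) we obtain 

$\frac{E_0[(-u_T) 1_{\{u_T \le 0\}}]}{P_0(u_T \ge \nu)} \le \left( \frac{\sigma_\infty^2}{I_\infty} + K I_\infty \right) \frac{\sigma_0^2 + I_0^2}{I_0^2} = \Theta(K)$ 

and consequently from (D.7) it follows that $E_0[u_{\mathcal{S}}] \ge \nu - \Theta(K)$. It remains 
to find a lower bound for $\nu$ in terms of $\gamma$. From the false alarm constraint 
and (D.6) we have 

(D.11)  $\gamma = E_\infty[-u_{\mathcal{S}}] = \frac{E_\infty[-u_T]}{P_\infty(u_T \ge \nu)}$. 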
For the expectation in the numerator, we can obtain the following upper 
bound 

(D.12)  $E_\infty[-u_T] = E_\infty[-u_T 1_{\{u_T \le 0\}}] + E_\infty[-u_T 1_{\{u_T \ge \nu\}}]$ 
        $\le E_\infty[-u_T 1_{\{u_T \le 0\}}] \le \frac{\sigma_\infty^2}{I_\infty} + K I_\infty$, 

where the final inequality follows from (D.9). In order to obtain a lower 
bound for the probability $P_\infty(u_T \ge \nu)$ in the denominator we start with a 
change of measure, thus 

(D.13)  $P_\infty(u_T \ge \nu) = E_0[e^{-u_T} 1_{\{u_T \ge \nu\}}] = E_0[e^{-u_T} \mid u_T \ge \nu] \, P_0(u_T \ge \nu)$. 

Then, with an application of the conditional Jensen inequality we have 

(D.14)  $E_0[e^{-u_T} \mid u_T \ge \nu] \ge \exp\left( -E_0[u_T \mid u_T \ge \nu] \right)$ 
        $= \exp\left( -\nu - \frac{E_0[(u_T - \nu) 1_{\{u_T \ge \nu\}}]}{P_0(u_T \ge \nu)} \right)$ 
        $\ge \exp\left( -\nu - \frac{\sup_{r \ge 0} E_0[u_{T_r^+} - r]}{P_0(u_T \ge \nu)} \right)$ 
        $\ge \exp\left( -\nu - \frac{\sigma_0^2 / I_0 + K I_0}{P_0(u_T \ge \nu)} \right)$, 

where in the last inequality we have used, again, Lorden's [12] upper bound 
for the maximal average overshoot. Combining (D.13) and (D.14) we obtain 

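(D.15)  $P_\infty(u_T \ge \nu) \ge \exp\left( -\nu - \frac{\sigma_0^2 / I_0 + K I_0}{P_0(u_T \ge \nu)} \right) P_0(u_T \ge \nu)$ 
        $\ge \exp\left( -\nu - \left( \frac{\sigma_0^2}{I_0} + K I_0 \right) \frac{\sigma_0^2 + I_0^2}{I_0^2} \right) \frac{I_0^2}{\sigma_0^2 + I_0^2}$, 

where the second inequality follows from (D.10). Then, from (D.11), (D.12) 
and (D.15) we have 

$\gamma \le \left( \frac{\sigma_\infty^2}{I_\infty} + K I_\infty \right) \frac{\sigma_0^2 + I_0^2}{I_0^2} \, \exp\left( \nu + \left( \frac{\sigma_0^2}{I_0} + K I_0 \right) \frac{\sigma_0^2 + I_0^2}{I_0^2} \right)$. 

Taking logarithms we obtain $\log \gamma \le \Theta(\log K) + \nu + K \Theta(1)$, which implies 
that $\log \gamma \le \nu + \Theta(K)$ and completes the proof. □ 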
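
To make the order-$K$ growth of the constants in this proof concrete, the Python sketch below evaluates the two gap terms for a range of $K$. It is a rough illustration under stated assumptions: the function name `lemma6_gaps` and the numerical values of $I_0$, $I_\infty$, $\sigma_0^2$, $\sigma_\infty^2$ are hypothetical, and the formulas are the bounds $(\sigma_\infty^2 / I_\infty + K I_\infty)$, $(\sigma_0^2 / I_0 + K I_0)$ and $I_0^2 / (\sigma_0^2 + I_0^2)$ as read from (D.9), (D.14) and (D.10).

```python
import math


def lemma6_gaps(K, I0, Iinf, var0, varinf):
    """Return (delay_gap, log_gamma_gap) for K sensors.

    delay_gap bounds nu - E_0[u_S]; log_gamma_gap bounds log(gamma) - nu.
    All inputs are hypothetical averages over sensors.
    """
    over_inf = varinf / Iinf + K * Iinf    # bound on E_inf[-u_T 1{u_T <= 0}]
    over_0 = var0 / I0 + K * I0            # overshoot bound under P_0
    p_lower = I0 ** 2 / (var0 + I0 ** 2)   # lower bound on P_0(u_T >= nu)
    delay_gap = over_inf / p_lower
    log_gamma_gap = math.log(over_inf / p_lower) + over_0 / p_lower
    return delay_gap, log_gamma_gap


# Illustrative values only (assumptions, not taken from the paper).
for K in (1, 10, 100, 1000):
    d, g = lemma6_gaps(K, I0=0.5, Iinf=0.5, var0=1.0, varinf=1.0)
    print(K, round(d / K, 3), round(g / K, 3))  # ratios settle as K grows
```

Both ratios approach constants as $K$ increases, which is what the $\Theta(K)$ terms in the statement of the lemma express.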
Lemma 7. For $E_0[\tilde{u}_{\tilde{\mathcal{S}}}]$ we have the following upper 
bound: 

(D.16)  $E_0[\tilde{u}_{\tilde{\mathcal{S}}}] \le \log \gamma + K \Theta(\Delta)$. 

Proof. Let us first define 

(D.17)  $M := \max_{1 \le k \le K} \max_{1 \le j \le d_k} \{\overline{\Lambda}_j^k, \underline{\Lambda}_j^k\}$. 

It is clear that $KM$ is an upper bound for the overshoot $\tilde{y}_{\tilde{\mathcal{S}}} - \tilde{\nu}$, thus 

(D.18)  $\tilde{u}_{\tilde{\mathcal{S}}} \le \tilde{y}_{\tilde{\mathcal{S}}} \le \tilde{\nu} + KM \le \log \gamma - \log(I_\infty) + KM$, 

where the first inequality is due to the fact that $\inf_{0 \le s \le t} \tilde{u}_s \le 0$, whereas the 
third inequality is due to Lemma 4. The quantity $\log(I_\infty)$ is independent of 
the thresholds $\{\overline{\Delta}^k, \underline{\Delta}^k\}$ and remains bounded as $K \to \infty$, thus $\log(I_\infty) = \Theta(1)$. 
Therefore, it remains to show that $M = \Theta(\Delta)$. Indeed, for each 
$1 \le j \le d_k$ and $1 \le k \le K$ we have: 

(D.19) A k < A k < A k = A k + Rj. 

k 

From the proof of Lemma 2 it is clear that each $R_j$ is an $O(1)$ term as 
$\Delta \to \infty$ (and $o(1)$ as $d_k \to \infty$), thus $\overline{\Lambda}_j^k$ is bounded above and below by a 
$\Theta(\Delta)$ term, which proves that $\overline{\Lambda}_j^k = \Theta(\Delta)$. Similarly it can be shown that 
$\underline{\Lambda}_j^k = \Theta(\Delta)$ and this completes the proof. □ 

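Lemma 8. 

(D.20)  $E_0[u_{\tilde{\mathcal{S}}} - \tilde{u}_{\tilde{\mathcal{S}}}] \le K \Theta(\Delta) + \theta \, E_0[m_{\tilde{\mathcal{S}}}]$, 

where $\theta := \max_{1 \le k \le K} \theta^k$. 

Proof. We first observe that for any $t$ and $k$ we have 

(D.21)  $u_t^k - \tilde{u}_t^k = u_t^k - u^k_{\tau^k_{m_t^k}} + u^k_{\tau^k_{m_t^k}} - \tilde{u}^k_{\tau^k_{m_t^k}}$ 
        $\le \overline{\Delta}^k + \sum_{n=1}^{m_t^k} [\ell_n^k - \tilde{\ell}_n^k] \le \overline{\Delta}^k + \sum_{n=1}^{m_t^k} \eta_n^k$. 

The first inequality is due to the fact that $u_t^k - u^k_{\tau^k_{m_t^k}}$ is upper bounded by 
$\overline{\Delta}^k$ between transmissions, whereas the second inequality follows from (5.9). 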
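If we now replace $t$ with $\tilde{\mathcal{S}}$, take expectation with respect to $P_0$ and apply 
Lemma 5 (in particular, (D.3)), we obtain 

(D.22)  $E_0[u^k_{\tilde{\mathcal{S}}} - \tilde{u}^k_{\tilde{\mathcal{S}}}] \le \overline{\Delta}^k + (E_0[m^k_{\tilde{\mathcal{S}}}] + 1) \, E_0[\eta_1^k] \le \overline{\Delta}^k + \theta^k + \theta^k E_0[m^k_{\tilde{\mathcal{S}}}]$, 

where the second inequality follows from Lemma 2. Then, summing over $k$ 
we obtain 

(D.23)  $E_0[u_{\tilde{\mathcal{S}}} - \tilde{u}_{\tilde{\mathcal{S}}}] \le \sum_{k=1}^{K} E_0[u^k_{\tilde{\mathcal{S}}} - \tilde{u}^k_{\tilde{\mathcal{S}}}] \le \sum_{k=1}^{K} (\overline{\Delta}^k + \theta^k) + \sum_{k=1}^{K} \theta^k E_0[m^k_{\tilde{\mathcal{S}}}]$, 

which implies (D.20), since $\overline{\Delta}^k = \Theta(\Delta)$ and $\theta^k = \Theta(1)$ as $\Delta \to \infty$. □ 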

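Lemma 9. For the average number of messages received by the fusion 
center up to time $\tilde{\mathcal{S}}$ we have the following bound: 

(D.24)  $E_0[m_{\tilde{\mathcal{S}}}] \le \frac{\log \gamma}{\Theta(\Delta)} + K \Theta(1)$. 

Proof. For every $k$ we have $\tilde{u}^k_{\tilde{\mathcal{S}}} = \sum_{n=1}^{m^k_{\tilde{\mathcal{S}}}} \tilde{\ell}_n^k$ and $|\tilde{\ell}_n^k| \le M$ for every $n$, 
where $M$ is defined as in (D.17). Therefore, from Lemma 5, in particular 
(D.4), it follows that 

(D.25)  $E_0[\tilde{u}^k_{\tilde{\mathcal{S}}}] \ge E_0[m^k_{\tilde{\mathcal{S}}}] \, E_0[\tilde{\ell}_1^k] - 2M$. 

Then, summing over $k$ we obtain 

$E_0[\tilde{u}_{\tilde{\mathcal{S}}}] \ge \sum_{k=1}^{K} \left[ E_0[m^k_{\tilde{\mathcal{S}}}] \, E_0[\tilde{\ell}_1^k] - 2M \right] \ge \left( \min_{1 \le k \le K} E_0[\tilde{\ell}_1^k] \right) E_0[m_{\tilde{\mathcal{S}}}] - 2KM$ 

and consequently 

(D.26)  $E_0[m_{\tilde{\mathcal{S}}}] \le \frac{E_0[\tilde{u}_{\tilde{\mathcal{S}}}] + 2KM}{\min_{1 \le k \le K} E_0[\tilde{\ell}_1^k]} \le \frac{\log \gamma - \log(I_\infty) + 3KM}{\min_{1 \le k \le K} E_0[\tilde{\ell}_1^k]}$, 

where the second inequality is due to (D.18). Since $M = \Theta(\Delta)$, which was 
shown in the proof of Lemma 7, it suffices to show that $E_0[\tilde{\ell}_1^k] \ge \Theta(\Delta)$ for 
every $1 \le k \le K$, since in this case (D.26) gives 

(D.27)  $E_0[m_{\tilde{\mathcal{S}}}] \le \frac{\log \gamma + \Theta(1) + K \Theta(\Delta)}{\Theta(\Delta)} = \frac{\log \gamma}{\Theta(\Delta)} + K \Theta(1)$. 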
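Indeed, since $\overline{\Lambda}_j^k \ge \overline{\Delta}^k$ and $\underline{\Lambda}_j^k \le M$, we obtain 

(D.28)  $E_0[\tilde{\ell}_1^k] = \sum_{j=1}^{d_k} \left[ \overline{\Lambda}_j^k P_0(z_1^k = j) - \underline{\Lambda}_j^k P_0(z_1^k = -j) \right]$ 
        $\ge \overline{\Delta}^k P_0(z_1^k > 0) - M \, P_0(z_1^k < 0)$ 
        $= \overline{\Delta}^k - (\overline{\Delta}^k + M) \, P_0(z_1^k < 0) = \Theta(\Delta) + o(1)$. 

The last equality is due to the fact that $M = \Theta(\Delta)$ and 

$\overline{\Delta}^k P_0(z_1^k < 0) = \overline{\Delta}^k e^{-\underline{\Delta}^k} E_\infty\left[ e^{\ell_1^k + \underline{\Delta}^k} 1_{\{\ell_1^k \le -\underline{\Delta}^k\}} \right] = \overline{\Delta}^k e^{-\underline{\Delta}^k} \, \Theta(1)$, 

since from renewal theory it is well known that $E_\infty\left[ e^{\ell_1^k + \underline{\Delta}^k} 1_{\{\ell_1^k \le -\underline{\Delta}^k\}} \right]$ is an 
asymptotically convergent, thus bounded, term as $\overline{\Delta}^k, \underline{\Delta}^k \to \infty$. □ 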